(57) Abstract

The invention relates to infectious cDNA clones for Dengue 2 virus, strain 16681, and its live, attenuated vaccine derivative, PDK-53
(DEN-2 PDK-53). The invention also relates to infectious cDNA clones for chimeric viruses characterized as expressing structural genes
of a Dengue 1, Dengue 3, or Dengue 4 attenuated virus in the context of the nonstructural genes of the Dengue 2 PDK-53 virus (DEN-2/1,
DEN-2/3, DEN-2/4). The invention further relates to genetic constructs encoding these cDNAs, and host cells containing these constructs.
The invention moreover relates to quadravalent vaccines providing immunity against all four serotypes of dengue virus comprising DEN-2
PDK-53 infectious clone derivative, DEN-2/1, DEN-2/3, or DEN-2/4 viruses, and related methods of immunization.
Field of the Invention
The invention relates to infectious cDNA clones for
Dengue 2 virus, strain 16681, and its live, attenuated
vaccine derivative, PDK-53 (DEN-2 PDK-53). The invention
also relates to infectious cDNA clones for chimeric
viruses characterized as expressing structural genes of a
Dengue 1, Dengue 3, or Dengue 4 attenuated virus in the
context of the nonstructural genes of the Dengue 2 PDK-53
virus (DEN-2/1, DEN-2/3, DEN-2/4). The invention further
relates to genetic constructs encoding these cDNAs, and
host cells containing these constructs. The invention
moreover relates to quadravalent vaccines providing
immunity against all four serotypes of dengue virus
comprising DEN-2 PDK-53 infectious clone derivative,
DEN-2/1, DEN-2/3, or DEN-2/4 viruses, and related methods of
immunization.

Background of the Invention
Arthropod-borne viruses (arboviruses) are a diverse 
group of viruses that have been lumped together on the 
basis of their ecological niche, which involves cycles of 
transmission between vertebrate hosts and arthropod 
vectors such as mosquitos and ticks. The prototype 
arbovirus is yellow fever virus, a flavivirus, which was 
isolated in 1927. In the 1950s, the Rockefeller
Foundation established a number of field stations in
various tropical countries for the purpose of isolating
new viruses. The 1985 International Catalogue of
Arboviruses Including Certain Other Viruses of Vertebrates
contains registrations for 504 discrete arboviruses, 124
of which have caused disease in humans. Thirty-four
viruses of the Flavivirus genus (family Flaviviridae) of
arboviruses are human pathogens (Karabatsos, 1985). (All
publications cited hereunder are incorporated herein by
reference.)

According to a 1992 World Health Organization (WHO)
press release (Press Release WHO/74, November 24, 1992),
dengue hemorrhagic fever is one of the most important and
increasing mosquito-transmitted infections in the world,
with more than 85 countries in Asia, the Pacific Islands,
Africa, Central America, and South America being
threatened with dengue outbreaks. Dengue fever was known
in the past as "breakbone fever" due to the severe
muscular and joint pain that accompanied the high fever
during this infection. Dengue is an under-reported
disease: it is thought that millions of cases occur each
year.

Dengue (DEN) viruses, which are flaviviruses, are
classified antigenically into 4 serotypes (DEN-1, DEN-2,
DEN-3, and DEN-4). Multiple serotypes are now endemic in
most countries in the tropics. DEN viruses are
transmitted to humans principally by Aedes aegypti
mosquitos throughout much of the tropical and subtropical
regions of the world. Viruses of all four serotypes infect
humans and cause clinically inapparent infection or
illness ranging from dengue fever to severe and often
fatal dengue hemorrhagic fever/dengue shock syndrome
(DHF/DSS). DHF/DSS has been associated epidemiologically
and experimentally with immune enhancement of virus
replication by preexisting, subneutralizing levels of
heterotypic antibody. About 90% or more of patients with
DHF/DSS are children who are 14 years old or younger
(Halstead, 1970; Halstead, 1988). Case fatality rates in
untreated individuals can be as high as 15-20%. Between
1956 and 1978, hospitalization of more than 350,000 dengue
patients and about 12,000 deaths in Southeast Asia were
reported to the WHO (Halstead, 1980). More recent dengue
epidemics in Asia, the Pacific islands, the Americas, and
Africa indicate that the incidence of the disease, with up
to 40 million cases annually, and its geographic
distribution are increasing in Aedes aegypti-infested
areas of the world (Halstead, 1984; Gubler, 1988; Brandt,
1990).

Since eradication of Aedes aegypti mosquitos appears
to be practically infeasible, development of safe,
effective vaccines against all four serotypes of DEN virus
is a WHO priority (Gubler, 1988; Brandt, 1988; Brandt,
1990). Since the level of DEN virus replication in
certified cell cultures yields insufficient antigenic mass
to produce effective inactivated vaccines, priority is
given to developing effective live, attenuated vaccine
viruses and to using a variety of expression systems such as
recombinant vaccinia or avipox virus (live vaccine),
recombinant baculovirus (subunit vaccine), and recombinant
E. coli (subunit vaccine) to express certain genes of the
DEN viral genome (Brandt, 1988; Brandt, 1990).



Flaviviruses are enveloped RNA viruses 45 to 50 nm in
diameter that contain a single-stranded, positive-sense
capped RNA genome of approximately 11 kb. The RNA genome
does not have a 3'-terminal poly(A) tail. Because the
genetic molecule of flaviviruses is positive or messenger
RNA (mRNA)-sense, naked genomic RNA injected, transfected,
or electroporated into mammalian or invertebrate cells is
capable of associating directly with the ribosomal protein
synthetic machinery of the cell. All of the viral
proteins are translated from the inserted viral genomic
mRNA. These virus-specified proteins then replicate the
viral genome, resulting in intracellular virus maturation
and release of infectious virus from the transfected cell.
The gene organization of the flavivirus mRNA genome,
illustrated below, is 5'-noncoding region (5'-NC)-capsid-
premembrane/membrane (prM/M)-envelope (E)-nonstructural
protein 1 (NS1)-NS2A-NS2B-NS3-NS4A-NS4B-NS5-3'-noncoding
region (3'-NC). The structural proteins capsid, prM/M,
and E and the nonstructural proteins are translated as a
large precursor polyprotein molecule from a single long open
reading frame in the mRNA genome. The individual mature
viral proteins are processed from the polyprotein by both
cell- and virus-specified proteases (Westaway et al., 1985;
Coia et al., 1988; Speight and Westaway, 1989; Rice et
al., 1985).



Genome Organization of Dengue Virus and Other Flaviviruses 
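
The gene order described above can be written down as a small data structure. The following Python sketch is offered only as an illustration of that ordering; the names GENOME_ORDER, classify, and polyprotein_order are hypothetical and are not part of the specification.

```python
# Flavivirus genome regions in 5'-to-3' order, as listed in the text.
GENOME_ORDER = [
    "5'-NC", "C", "prM/M", "E",
    "NS1", "NS2A", "NS2B", "NS3", "NS4A", "NS4B", "NS5",
    "3'-NC",
]

STRUCTURAL = {"C", "prM/M", "E"}   # capsid, premembrane/membrane, envelope
NONCODING = {"5'-NC", "3'-NC"}     # flanking noncoding regions


def classify(region: str) -> str:
    """Return 'noncoding', 'structural', or 'nonstructural' for a region name."""
    if region not in GENOME_ORDER:
        raise ValueError(f"unknown region: {region}")
    if region in NONCODING:
        return "noncoding"
    if region in STRUCTURAL:
        return "structural"
    return "nonstructural"


def polyprotein_order() -> list[str]:
    """Regions translated from the single open reading frame, in genome order."""
    return [r for r in GENOME_ORDER if r not in NONCODING]


if __name__ == "__main__":
    for region in GENOME_ORDER:
        print(f"{region:6s} {classify(region)}")
```

The polyprotein list simply drops the two noncoding regions, which reflects the statement that structural and nonstructural proteins are translated from one open reading frame.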
The structural proteins are those viral proteins that
are incorporated into the mature virion. The virion
consists of an icosahedral capsid (C) that packages the
viral genomic mRNA (nucleocapsid). The nucleocapsid is
surrounded by a cell-derived lipid membrane into which the
envelope (E) and mature membrane (M) proteins are
imbedded. The virus-specific nonstructural genes, NS1-NS5,
are expressed in the cytoplasm of the infected cell
and are involved in the replication and maturation of the
viral RNA genome and viral proteins.

The E glycoprotein of the virus is exposed to the
environment and is involved in attachment and entry of the
virus into the cell. The E protein is the primary viral
immunogen against which the infected vertebrate host
develops virus-specific neutralizing antibody. The E gene
is the most common target for development of molecular
systems to express the encoded E glycoprotein. However,
immunization with various purified nonstructural genes of
the virus has been shown to elicit protective immunity
against challenge with wild-type virus, probably via
cytotoxic T-cell mediated lysis of infected cells which
express viral nonstructural proteins on the cell surface.

Vaccination can be one of the most cost-effective
ways to prevent dengue fever and DHF/DSS. Since 1979 the
WHO has supported research on dengue vaccine development
at the Mahidol University in Bangkok, Thailand (Press
Release WHO/74, November 24, 1992). Investigators at
Mahidol University have developed four live, attenuated
candidate vaccine viruses, one for each of the four
serotypes, by serial passage of the virulent parent
viruses in primary dog kidney (PDK) or fetal rhesus lung
(FRhL) cell culture (Yoksan et al., 1986; Bhamarapravati
et al., 1987). Phase 1 and Phase 2 clinical trials in
Thailand have demonstrated that the vaccine is both safe
and immunogenic in humans. The vaccines now need to be
tested for efficacy in large numbers of children (Press
Release WHO/74, November 24, 1992). To preclude the
possible severe DHF/DSS immune enhancement phenomenon in
vaccinees who might be infected naturally with a
heterologous serotype of wild-type DEN virus following
immunization with a single serotype of vaccine virus, it
is essential that humans be vaccinated with a quadravalent
vaccine to provide immunity against all four serotypes of
the virus.

Summary of the Invention
The invention provides a quadravalent vaccine
providing immunity against all four serotypes of dengue
virus comprising a DEN-2 PDK-53 infectious clone-derived
virus.

The invention also provides a quadravalent
vaccine providing immunity against all four serotypes of
dengue virus comprising a chimeric DEN-2/1 virus.

The invention further provides a quadravalent
vaccine providing immunity against all four serotypes of
dengue virus comprising a chimeric DEN-2/3 virus.

The invention moreover provides a quadravalent
vaccine providing immunity against all four serotypes of
dengue virus comprising a chimeric DEN-2/4 virus.

The invention additionally provides a
quadravalent vaccine providing immunity against all four
serotypes of dengue virus comprising DEN-2 PDK-53
infectious clone-derived and chimeric DEN-2/1, DEN-2/3,
and DEN-2/4 viruses.

In another aspect, the invention provides a
method of immunization in which a desired immune response
is produced against all four serotypes of dengue virus
comprising the step of administering to a subject a
quadravalent vaccine comprising DEN-2 PDK-53 infectious
clone-derived and chimeric DEN-2/1, DEN-2/3, and DEN-2/4
viruses.

In yet another aspect, the invention provides a
composition of matter comprising a full genome-length
infectious cDNA clone for a DEN-2 virus, strain 16681.

The invention also provides a composition of
matter comprising a full genome-length infectious cDNA
clone for a DEN-2 virus of a strain characterized as
replicating to high titer in cell culture.

The invention further provides a composition of
matter comprising a full genome-length infectious cDNA
clone for a DEN-2 virus, strain 16681, having the
identifying characteristics of ATCC 69826.

In still another aspect, the invention provides
a composition of matter comprising a full genome-length
infectious cDNA clone for a DEN-2 virus, strain 16681,
attenuated derivative, PDK-53.

The invention also provides a composition of
matter comprising a full genome-length infectious cDNA
clone for a DEN-2 virus attenuated derivative,
characterized as replicating to high titer in cell
culture.

The invention further provides a composition of
matter comprising a full genome-length infectious cDNA
clone for a DEN-2 virus, strain 16681, attenuated
derivative, PDK-53, having the identifying characteristics
of ATCC 69825.

In another aspect, the invention provides a
composition of matter comprising a full genome-length
infectious cDNA clone of a chimeric DEN-2/1 virus, wherein
the virus is characterized as expressing the prM and E
genes of a DEN-1 attenuated virus in the context of the
nonstructural genes of the DEN-2 PDK-53 virus. The DEN-1
attenuated virus may be DEN-1 PDK-13.

The invention also provides a composition of
matter comprising a full genome-length infectious cDNA
clone of a chimeric DEN-2 virus, wherein the virus is
characterized as expressing the antigenicity of a DEN-1
attenuated virus.

In yet another aspect, the invention provides a
composition of matter comprising a full genome-length
infectious cDNA clone of a chimeric DEN-2/3 virus, wherein
the virus is characterized as expressing the prM and E
genes of a DEN-3 attenuated virus in the context of the
nonstructural genes of the DEN-2 PDK-53 virus. The DEN-3
attenuated virus may be DEN-3 PGMK30/FRhL-3.

The invention also provides a composition of
matter comprising a full genome-length infectious cDNA
clone of a chimeric DEN-2 virus, wherein the virus is
characterized as expressing the antigenicity of a DEN-3
attenuated virus.

In still another aspect, the invention provides
a composition of matter comprising a full genome-length
infectious cDNA clone of a chimeric DEN-2/4 virus, wherein
the virus is characterized as expressing the prM and E
genes of a DEN-4 attenuated virus in the context of the
nonstructural genes of the DEN-2 PDK-53 virus. The DEN-4
attenuated virus may be DEN-4 PDK-48.

The invention also provides a composition of
matter comprising a full genome-length infectious cDNA
clone of a chimeric DEN-2 virus, wherein the virus is
characterized as expressing the antigenicity of a DEN-4
attenuated virus.

Additionally, the invention provides a genetic
construct comprising a DNA sequence operably encoding the
polyprotein of DEN-2 virus, strain 16681. The polyprotein
may be the polyprotein encoded by the nucleotide sequence
of SEQ ID NO: 1.

The invention also provides a genetic construct
comprising a DNA sequence operably encoding at least one
protein of DEN-2 virus, strain 16681. The protein may be
a protein encoded by the nucleotide sequence of SEQ ID NO:
1.

Further, the invention provides a genetic
construct comprising a DNA sequence operably encoding the
polyprotein of DEN-2 virus, strain 16681, attenuated
derivative, PDK-53. The polyprotein may be the
polyprotein encoded by the nucleotide sequence of SEQ ID
NO: 2.



WO 96/40933 



PCT/US96/09209 




The invention also provides a genetic construct
comprising a DNA sequence operably encoding at least one
protein of DEN-2 virus, strain 16681, attenuated
derivative, PDK-53. The protein may be a protein encoded
by the nucleotide sequence of SEQ ID NO: 2.

Moreover, the invention provides a genetic
construct comprising a DNA sequence operably encoding at
least one structural protein of DEN-1 PDK-13. The
structural protein may be a structural protein encoded by
the nucleotide sequence of SEQ ID NO: 124.

In another aspect, the invention provides a
genetic construct comprising a DNA sequence operably
encoding at least one structural protein of DEN-3
PGMK30/FRhL-3. The structural protein may be a structural
protein encoded by the nucleotide sequence of SEQ ID NO:
125.

In still another aspect, the invention provides
a genetic construct comprising a DNA sequence operably
encoding at least one structural protein of DEN-4 PDK-48.
The structural protein may be a structural protein encoded
by the nucleotide sequence of SEQ ID NO: 126.

In yet another aspect, the invention includes a
host cell comprising any of the above genetic constructs.

Brief Description of the Drawings

Figure 1: Strategy for construction of the full
genome-length cDNA clone of DEN-2 virus. Using PCR
technology, cDNA is amplified from the genomic RNA of the
virus and cloned. Subclones are spliced together at
unique, overlapping restriction enzyme sites to construct
the full genome-length clone. Numbered arrows indicate
upstream (right arrows) and downstream (left arrows)
primers used to amplify the cDNA in PCR reactions.

Figure 2: Transcription of genomic mRNA from the
full-length infectious cDNA clone of DEN-2 virus. The
recombinant plasmid is linearized at the unique XbaI site
at the 3'-end of the genomic cDNA. Bacteriophage T7 RNA
polymerase recognizes the T7 promoter engineered at the
5'-end of the cDNA and transcribes full-length viral mRNA
from the cDNA template.

Figure 3: Restriction enzyme sites identified in the
nucleotide sequence of the RNA genome of DEN-2 16681
virus. Locations for the sites are indicated by the
genome nucleotide numbers. Restriction enzymes that
cleave the DEN-2 genomic cDNA at only a single location
are listed vertically at the top of the figure. The
resolution of the RENZ graph is 97.5 nucleotides per dot.
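
The idea of listing enzymes that cleave a cDNA at only a single location can be sketched in a few lines of Python. This is a minimal illustration, not the analysis used to prepare the figure: the enzyme panel is a small assumed subset, the function names are hypothetical, and the test sequence is an invented toy string rather than dengue sequence data. All five recognition sequences shown are palindromic, so scanning one strand is sufficient for them.

```python
import re

# Recognition sequences for a few common restriction enzymes
# (an assumed, illustrative panel; all are palindromic 6-mers).
SITES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "XhoI": "CTCGAG",
    "KpnI": "GGTACC",
    "SphI": "GCATGC",
}


def site_positions(seq: str, site: str) -> list[int]:
    """1-based start positions of every occurrence of site, overlaps included."""
    return [m.start() + 1 for m in re.finditer(f"(?={site})", seq.upper())]


def single_cutters(seq: str, sites: dict[str, str] = SITES) -> dict[str, int]:
    """Map each enzyme that cuts exactly once to the position of its site."""
    result = {}
    for enzyme, site in sites.items():
        hits = site_positions(seq, site)
        if len(hits) == 1:
            result[enzyme] = hits[0]
    return result


if __name__ == "__main__":
    # Hypothetical toy sequence, NOT dengue sequence data.
    toy = "AAGAATTCGGTCTAGACCGAATTCTTGCATGCAA"
    print(single_cutters(toy))
```

In the toy sequence EcoRI has two sites and is therefore excluded, while XbaI and SphI each occur once.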

Figure 4: Growth curve of DEN-2 16681 virus in C6/36
mosquito cells.

Figure 5: (A) Polaroid prints showing RT/PCR
amplification of the entire mRNA genome of DEN-2 virus,
strain 16681, in the form of 5 cDNA amplicons. The
molecular weight marker (MW) consists of linear,
double-stranded DNA markers of various base pair (bp) lengths.
The top 2 gels show 5-µl aliquots of the original RT/PCR
reactions. The bottom two gels show 10% of the yield
following HMC agarose gel purification of the remaining
95-µl reaction aliquots. (B) Primers (amplimers) used in
the RT/PCR reactions and the expected sizes of the
resulting cDNA amplicons.
Figure 6: EcoRI restriction enzyme digests of F2,
F2-Sal, Sal-F2, and F3 miniprep recombinant plasmid DNA.
Plasmids from individual colonies resulting from
transformation with independent ligated, recombinant
plasmid molecules are numbered. The insert in the single
F2-8 plasmid was too small and was discarded. The
remaining recombinant plasmids contained cDNA inserts of
expected size. As expected, F2-Sal cDNA contained two
internal EcoRI sites; the Sal-F2 and F3 plasmids contained
a single internal EcoRI site. EcoRI digestion of the
recombinant plasmids regenerated linearized, wild-type
3.9-kb pCRII vector. For an undetermined reason, one of
the EcoRI sites in plasmid F3-I did not cut.

Figure 7: Schematic diagram showing the genomic
locations of DEN-2 16681 virus-specific cDNA clones.
Clones indicated with asterisks were spliced together at
the indicated restriction enzyme sites to construct the
full genome-length cDNA clone. Black horizontal bars
indicate clone regions that were sequenced. Light gray
regions of horizontal bars indicate clone regions that
were not sequenced.

Figure 8: (A) Effect of adding Taq extender reagent
to PCR reactions. The 5.2-kbp amplicon of St. Louis
encephalitis virus was readily obtained by extended PCR
(+) but not by standard PCR (-). (B) Agarose gel
electropherogram showing DEN-2 PDK-53 F1, F2, and F3
amplicons derived by extended PCR.

Figure 9: Schematic diagram showing the genome
locations of errors identified in the cDNA clones of DEN-2
16681. Errors are indicated by short vertical tick marks.
Figure 10: Schematic diagram illustrating the
approximate genome locations of the nucleotide
discrepancies between the data of Applicants and those of
Blok et al. (1992) for the sequence of the genome of DEN-2
virus, strain 16681.

Figure 11: Nucleotide sequence of the genome of DEN-2
strain 16681 virus. Differences between the data
determined by Blok et al. (1992) (DEN-2-16681.BLOK) and
those obtained by Applicants (DEN-2-16681.RK). The genome
nucleotide positions of the sequence differences are
listed vertically. The solid squares indicate those
nucleotide differences that also encode amino acid
substitutions. The remaining nucleotide differences are
either silent, encoding the same amino acid, or lie within
the 5'-noncoding (5'-NC) or 3'-noncoding region (3'-NC).

Figure 12: Schematic diagram showing the DEN-2 PDK-53
virus-specific cDNA clones and the approximate
locations of cDNA errors (vertical tick marks) identified
by nucleotide sequence analyses. Clones marked with an
asterisk were used in the construction of the DEN-2 PDK-53
virus-specific full-length cDNA clone. Clone #19 had a
203-bp deletion (horizontal line).

Figure 13: Schematic summary of the DEN-2 16681 vs.
PDK-53 virus sequencing projects. Arrows indicate the
nucleotide differences detected between the two genomes.
Triangles indicate those nucleotide changes that resulted
in amino acid substitutions.

Figure 14: Finalized nucleotide and amino acid
sequence of the RNA genome of DEN-2 virus, strain 16681
(SEQ ID NO:1). The nucleotide and amino acid mutations
that were determined to have occurred in DEN-2 virus,
strain PDK-53, are indicated at the appropriate positions
(SEQ ID NO:2). The EcoRI, SstI, MluI, and T7 promoter
sites that were engineered immediately preceding the
5'-terminal nucleotide of the virus-specific genomic cDNA are
shown. The start positions of the viral genes and
noncoding regions (5'-NC and 3'-NC) are shown. Potential
sites of Asn-linked glycosylation (Asn-X-Ser or Thr, where
X = any amino acid) in prM, E, and NS1 are indicated by
asterisks. The deduced amino acid sequence is indicated
in standard single-letter abbreviation: A = Ala, C = Cys,
D = Asp, E = Glu, F = Phe, G = Gly, H = His, I = Ile, K =
Lys, L = Leu, M = Met, N = Asn, P = Pro, Q = Gln, R = Arg,
S = Ser, T = Thr, V = Val, W = Trp, Y = Tyr.
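
The Asn-X-Ser/Thr motif mentioned in this legend can be located in a single-letter protein sequence with a short pattern search. The Python sketch below is illustrative only: the function name find_sequons is hypothetical, the example peptide is invented, and X is taken to be any residue, exactly as the legend states.

```python
import re


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn in every Asn-X-Ser/Thr motif.

    X is treated as any amino acid, following the legend. A lookahead is
    used so that overlapping motifs (e.g. N-N-T-S) are all reported.
    """
    return [m.start() + 1
            for m in re.finditer(r"(?=N[A-Z][ST])", protein.upper())]


if __name__ == "__main__":
    toy = "MNGSANATNNTS"  # hypothetical peptide, not a dengue protein
    print(find_sequons(toy))
```

Reporting the position of the Asn residue mirrors how the figure marks potential glycosylation sites with asterisks at the asparagine.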

Figure 15: Construction of intermediate clone F2 by
ligating the F2-Sal SphI/HpaI fragment and Sal-F2
HpaI/KpnI fragment into pUC18. The resulting F2 clone
contained a nonsilent cDNA error at genome nucleotide
position 1730.

Figure 16: Correction of the intermediate F2 clone.
A new PCR amplicon was cloned and sequenced. The
SphI/HpaI fragment of this clone was spliced into F2 to
construct F2-C having the correct nucleotide at genome
position 1730.

Figure 17: Construction of the intermediate F1/3/4/5
cDNA clone for DEN-2 16681 virus. The thick solid black
bars indicate DEN-2 virus-specific cDNA, illustrated with
the RENZ sites of the MCS of the plasmid. The RENZ sites
used in each step of the splicing strategy are indicated
in underlined, bold characters. The top half of the
figure shows construction of F1/3/4/5-pUC18. The bottom
portion of the figure illustrates the making of
F1/3/4/5-pUC19. The final step in the construction of the full
genome-length cDNA clone involved the ligation of the F2-C
SphI/KpnI cDNA fragment into plasmid containing cDNA
F1/3/4/5 and cut with RENZs SphI/KpnI. Although F2-C cDNA
could not be cloned into F1/3/4/5-pUC18, it was readily
cloned into F1/3/4/5-pUC19. The pUC18 plasmid containing
a small insert of cDNA made for Venezuelan equine
encephalitis (VEE) virus was used simply to move F1 and
F4/5 into pUC18 in a 3-molecule ligation reaction. The
VEE virus-specific cDNA was spliced out during this
process. Arrowheads under cDNA bars indicate orientation
of mRNA-sense cDNA strand.

Figure 18: Orientation-specific cloning of full
genome-length cDNA of DEN-2 16681 virus into the multiple
cloning site of pUC19. Although the full-length cDNA was
readily cloned in pUC19, multiple attempts to insert the
cDNA into pUC18 failed. Presumably, interaction of the
cDNA with pUC18-specific gene transcripts, translation of
a toxic DEN-2 polypeptide, or translation of a toxic
pUC18/DEN-2 fusion polypeptide produced deleterious
effects in E. coli. Large arrows indicate orientation of
mRNA-sense cDNA strands in the pUC plasmid backbone.
Smaller arrows indicate orientations of the lacZ and
ampicillin genes as well as the origin of replication.
DEN-2 insert is indicated by a thick solid black line.

Figure 19: Insertion of the MCS of plasmid pUC19
into pBR322 in both orientations to construct pBRUC-138
and pBRUC-139. The pUC18 HindIII (blunt-ended = BL)/EcoRI
MCS fragment was ligated into pBR322 cut with AvaI
(BL)/EcoRI to construct pBRUC-138. The pUC18 EcoRI
(BL)/HindIII MCS fragment was ligated into pBR322 cut with
AvaI (BL)/HindIII to make pBRUC-139. In both cases, the
tetracycline gene of pBR322 was removed. pBRUC-138 =
2992-bp (61-bp MCS + 2931-bp pBR322 deletion vector).
pBRUC-139 = 3022-bp (61-bp MCS + 2961-bp pBR322 deletion
vector). Orientations of ORI, ROP, and the Amp gene are
indicated.
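
The plasmid sizes quoted in this legend are simple sums of the MCS fragment and the pBR322 deletion vector, which can be checked mechanically. The Python sketch below uses only the numbers stated in the legend; the variable and function names are hypothetical.

```python
MCS_BP = 61  # multiple cloning site fragment length, per the legend

# pBR322 deletion-vector lengths quoted in the legend (bp).
VECTOR_BP = {"pBRUC-138": 2931, "pBRUC-139": 2961}

# Total plasmid lengths quoted in the legend (bp).
QUOTED_TOTAL_BP = {"pBRUC-138": 2992, "pBRUC-139": 3022}


def total_size(name: str) -> int:
    """Plasmid length as MCS fragment plus deletion vector."""
    return MCS_BP + VECTOR_BP[name]


if __name__ == "__main__":
    for name, quoted in QUOTED_TOTAL_BP.items():
        computed = total_size(name)
        status = "consistent" if computed == quoted else "MISMATCH"
        print(f"{name}: {MCS_BP} + {VECTOR_BP[name]} = {computed} bp ({status})")
```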

Figure 20: Construction of pD2/IC-30P, the full
genome-length cDNA clone of DEN-2 16681 virus, in plasmid
pBR322 (pBRUC-139 (SphI-) derivative). The F3/4/5 clone
cDNA was ligated into pBRUC-139 first (Top of Figure),
followed by F1-E and F2-C. Viable, infectious DEN-2 virus
was successfully obtained from viral mRNA transcribed from
this clone.

Figure 21: Construction of pD2/IC-130V, the full
genome-length cDNA clone of DEN-2 PDK-53 virus. A
nonsilent error in cDNA clone F3-3C was corrected by
splicing in a correct BstBI/NheI fragment from clone
F3.5-6 (Top). The resulting corrected clone F3-3CC was spliced
into the 16681 F345-F clone in pBRUC-139. cDNA fragments
F1-79B, F2-16B, and the recombinant F3/4/5 vector DNA were
spliced together in a single ligation reaction to produce
pD2/IC-130V. The NheI site occurs at genome nucleotide
position 6646. Therefore, the PDK-53 virus-specific
full-length cDNA clone contains the parental 16681
virus-specific nucleotide at position 8571. This nucleotide
difference is silent; it does not encode an amino acid
change. Other than the 8571 position, DEN-2 16681 and
PDK-53 viruses are identical in nucleotide sequence from
nucleotide position 6646 to the 3' terminus of the genome.

Figure 22: Agarose gel electropherogram of viral
genomic mRNA extracted from gradient-purified, wild-type
DEN-2 16681 virus and Venezuelan equine encephalitis (VEE)
virus. The quantity of RNA loaded onto the gel ranged
from 22 ng to 383 ng. The stock RNA was quantitated
spectrophotometrically at 260 nm. The genome-length RNA
band is clearly visible between the 4153-bp and 6788-bp MW
marker bands. Bands were visualized by incorporating 200
ng/ml of ethidium bromide stain in the gel and
electrophoresis buffer.

Figure 23: Transcription of RNA from pVE/IC-92 (VEE
virus clone) and pD2/IC-20 (DEN-2 16681 virus clone).
Transcription reaction conditions (100 ng linearized DNA
template, 12.5 mM DTT, 2.7 U/µl RNasin, 0.15 mM NTPs, 3.3
U/µl T7 RNA polymerase (Stratagene) in commercial buffer
(Stratagene)) yielded high quantity and quality of
infectious mRNA transcripts from the pVE/IC-92 clone and
3'-end truncation products of that clone. However, these
reaction conditions failed to permit transcription of RNA
from the pD2/IC-20 clone or two of its 3'-end
transcription products (clone linearized at the NsiI or
MroI site instead of at the 3'-terminal XbaI site).
pVE/IC-92 plasmid linearized at the MluI (3'-terminal),
SphI, Tth111I, HindIII, SalI, and StuI sites in the cDNA
clone yielded RNA transcripts of 11447, 11377, 7541, 2407,
1620, and 674 base length, respectively (the more intense,
prominent bands in these gel lanes).
Figure 24: Transcription of RNA from the DEN-2 16681
cDNA clone pD2/IC-20. (A) Transcription of RNA using
different quantities of linearized plasmid template (a,b).
The cap analog m7G(5')ppp(5')A was not included in the
reaction. (B) Transcription of 5'-capped RNA with
inclusion of cap analog in the reaction. Transcription
was accomplished with the Ampliscribe transcription kit
from Epicentre Technologies. T7 pol = bacteriophage T7
RNA polymerase.

Figure 25: Transcription of full genome-length,
infectious viral mRNA from XbaI-linearized DEN-2 16681
plasmid pD2/IC-30P (A and D replicate clones resulting
from independent bacterial colonies transformed with the
recombinant pBRUC/DEN-2 plasmid) and PDK-53 plasmid
pD2/IC-130V (F and J replicates). Genomic "viral RNA"
extracted from gradient-purified wild-type DEN-2 16681
virus was electrophoresed in lanes 2 and 10. Aliquots of
transcription reactions sampled before (T7 Pol "-") and
after (T7 Pol "+") addition of T7 RNA polymerase
are shown. Only the linearized plasmid DNA template is
observed in the absence of the polymerase.

Figure 26: Transcription of RNA from pD2/IC-20,
pD2/IC-30P, and pD2/IC-130V in the presence or absence of
T7 RNA polymerase or cap analog in the transcription
reaction. All lanes shown are on a single gel.
Transcription was performed with the Ampliscribe
transcription kit.

Figure 27: Derivation tree for the construction of
the DEN-2 16681 and PDK-53 virus-specific full
genome-length cDNA clones pD2/IC-30P and pD2/IC-130V,
respectively, and chimeric 16681/PDK-53 clones derived
from the two prototype clones.

Figure 28: Genotype maps of DEN-2 16681 and PDK-53
virus-specific full genome-length cDNAs and their chimeric
derivatives. The scale at the top indicates relative
genome nucleotide position in thousands. The graph
resolution is 119.1444 bp/dot. cDNA regions contributed
by the parental DEN-2 16681 virus are indicated by solid
black bars. Regions derived from the DEN-2 PDK-53 vaccine
virus are indicated by stippled bars. The 8 mutations
identified by sequence analyses of the genomes of the
16681 and PDK-53 viruses are indicated. The
virus-specific 5'-noncoding nucleotides are indicated in lower
case characters. The amino acids encoded by the
virus-specific nucleotide mutations in the protein coding region
of the genome are indicated in upper case, single-letter
amino acid abbreviation.

Figure 29: Results of spot-sequencing PCR amplicons
amplified from seed stocks of viruses derived from full
genome-length cDNA clones. Dots indicate nucleotide
sequence identity to the DEN-2 16681 virus. The expected
virus-specific nucleotides for the genotype of each virus
are shown. Those nucleotide positions that have actually
been confirmed by sequence analysis are indicated by
underlined nucleotide base characters. The actual genome
nucleotide positions are indicated at the bottom of the
Figure.

Figure 30: Recombinant full-length pD2/IC-30P-A and
pD2/IC-130V-F plasmids extracted from 1-ml aliquots of E.
coli TB-1 cultures submitted to ATCC.
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• Figure 31: Partial nucleotide sequences of candidate 
vaccine viruses : 

DEN-1 16007 PDK-13 (Dl.VAC) (SEQ ID NO: 124) 

DEN- 2 16681 PDK-53 (D2.VAC) ( see SEQ ID NO: 2) 

.5. DEN- 3 16562 PGMK-3 O/FRhL-3 (D3 .VAC) {SEQ ID NO: 125) 

DEN- 4 1036 PDK-48 (D4.VAC) (SEQ ID NO: 126) 

aligned with the nucleotide and deduced amino acid 
sequences of DEN -2 16681 virus. ( see SEQ ID N0:1) . Dots in 
the* DEN-1, DEN- 3 , and DEN-4 sequences signify identity 

10 with the DEN- 2 sequence. 

Figure 32: Partial amino acid sequences of candidate
vaccine viruses:

DEN-1 16007 PDK-13 (D1.VAC) (SEQ ID NO:124)

DEN-2 16681 PDK-53 (D2.VAC) (see SEQ ID NO:2)

DEN-3 16562 PGMK-30/FRhL-3 (D3.VAC) (SEQ ID NO:125)

DEN-4 1036 PDK-48 (D4.VAC) (SEQ ID NO:126)

aligned with the deduced amino acid sequence of DEN-2
16681 virus (see SEQ ID NO:1). Dots in the DEN-1, DEN-3,
and DEN-4 sequences signify identity with the DEN-2
sequence.

Figure 33: Mutagenesis analysis of the 5' end of the
prM gene. The 447-452 sequence ("AACCAC" in DEN-2) can be
mutated to "CTCGAG" in all four DEN viruses to create an
XhoI site for cassette splicing. This modification
results in conservative Thr-Thr to Ser-Ser substitutions
at amino acid positions prM 4-5 in DEN-2 virus. By
creating this XhoI site, all four viruses will contain the
sequence FHLSSR at amino acid positions prM 1-6 (see
Figure 32). Nucleotide mutations that are necessary to
create the XhoI site are indicated by bold, underlined
characters in the nucleotide sequences of D2.VAC, D1.VAC,
D3.VAC, and D4.VAC and their respective primers designed
for amplification in PCR.

Figure 34: Mutagenesis analysis of the 3' end of the
E gene. The 2344-2349 sequence ("TCACGC" in DEN-2) can be
mutated to "TCTAGA" in all four DEN viruses to create an
XbaI site for cassette splicing. This modification
results in no amino acid change in DEN-2 at this site, but
substitutions do occur in the other three viruses. By
creating this XbaI site, all four viruses will contain the
sequence SRS at amino acid positions E 470-472 (see Figure
32). Nucleotide mutations that are necessary to create
the XbaI site are indicated by bold, underlined characters
in the nucleotide sequences of D2.VAC, D1.VAC, D3.VAC, and
D4.VAC and their respective primers designed for
amplification in PCR.
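
The two hexamer substitutions described for Figures 33 and 34 can be checked with a short script. The following is only a minimal sketch: it uses the standard XhoI (CTCGAG) and XbaI (TCTAGA) recognition sequences, but the 18-nt toy template and its flanking bases are hypothetical and are not taken from any DEN sequence, and the helper names are illustrative.

```python
XHOI_SITE = "CTCGAG"  # XhoI recognition sequence
XBAI_SITE = "TCTAGA"  # XbaI recognition sequence


def substitute(seq, start, end, replacement):
    """Replace 1-based, inclusive positions start..end of seq."""
    if len(replacement) != end - start + 1:
        raise ValueError("replacement must match the length of the span")
    return seq[:start - 1] + replacement + seq[end:]


def find_sites(seq, site):
    """Return 1-based start positions of every occurrence of site in seq."""
    return [i + 1 for i in range(len(seq) - len(site) + 1)
            if seq[i:i + len(site)] == site]


def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical toy template: made-up flanks around the DEN-2 "AACCAC" hexamer.
toy = "GGATTC" + "AACCAC" + "GCAGTT"
mutant = substitute(toy, 7, 12, XHOI_SITE)

print("XhoI sites before:", find_sites(toy, XHOI_SITE))
print("XhoI sites after: ", find_sites(mutant, XHOI_SITE))
print("changes needed (prM, E):",
      mismatches("AACCAC", XHOI_SITE), mismatches("TCACGC", XBAI_SITE))
```

The mismatch count gives the number of nucleotide changes a mutagenic primer would have to carry for the DEN-2 hexamers quoted in the legends; the other serotypes would need their own hexamers supplied.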

Figure 35: Construction of DEN-2 PDK-53 cassette
plasmids pF1-Xho and pF2-Xba. (A) pF1-Xho: Clone PCR cDNA
amplicons F1-prM5' and F1-prM3' into TA-vector. Sequence
and splice correct clones together at the SphI site in the
TA-vector to construct pF1-prM53 (not shown). Subclone
the prM53 cDNA into SstI/SphI-cut pF1-E (see Figure 20) to
construct pF1-Xho. (B) pF2-Xba: Clone PCR cDNA amplicons
F2-E5' and F2-E3' into TA-vector. Splice correct clones
together at the XbaI site in the TA-vector to construct
pF2-E53 (not shown). Subclone the SphI/HpaI E53 cDNA
fragment into pF2-lGB (see Figure 21), which itself is
subcloned into pBRUC-139 between the SphI/KpnI sites (not
shown), to construct pF2-Xba. PCR amplimer designations
are underlined. Solid black bars indicate newly
synthesized and sequence-characterized cDNA. Stippled bar
indicates previously synthesized cDNA. Graph resolution =
64.1857 nucleotides/dot.

Figure 36: Construction of chimeric plasmids
containing the prM and E genes (XhoI-XbaI cDNA fragment)
of DEN-1, DEN-3, or DEN-4 candidate vaccine virus within
the genetic background of DEN-2 PDK-53 virus. pD2V-CAS12
was constructed by ligating the SstI/SphI fragment of
pF1-Xho and SphI/KpnI fragment of pF2-Xba (see Figure 33)
into a truncated form of pD2/IC-130V (see Figure 21).
pD2/IC-130V was truncated by restricting the full-length
clone at the NsiI-4696 and 3'-end XbaI sites, blunt-ending
with T4 DNA polymerase, and religating. This procedure
removed genome nucleotides 4696-10723, thereby removing
the XhoI-5426 and 3'-end XbaI sites, which would otherwise
interfere with construction of chimeric plasmid cassettes
using XhoI and XbaI sites. The cassette strategy employs
PCR amplification of DEN-1, DEN-3, and DEN-4 cDNAs
containing the prM and E genes; cutting the amplicons with
XhoI/XbaI; cloning resulting fragments into pD2V-CAS12 to
construct pD1V-CAS12, pD3V-CAS12, and pD4V-CAS12 chimeric
cassettes; confirming the chimeric XhoI/XbaI insert by
nucleotide sequence analysis; and then subcloning the
SstI/KpnI fragment of the chimeric cassette into
pD2/IC-130V to construct the chimeric full genome-length
cDNA clones from which chimeric DEN-2/1, -2/3, and -2/4
viruses are derived. The genetic background of DEN-2
PDK-53 virus is illustrated by the solid black bars. The
heterologous DEN-1, DEN-3, and DEN-4 cDNA inserts are
indicated by the stippled bars. The pBRUC-139 plasmid
backbone is not illustrated for pD1V-CAS12, pD3V-CAS12, or
pD4V-CAS12 chimeric plasmid. Resolution = 110.5464 bp/dot.

Detailed Description of the Invention
We developed a quadravalent vaccine by initially
constructing a full genome-length infectious cDNA clone
for DEN-2 virus. We chose serotype 2 of DEN virus because
virus strains of this serotype generally replicate to high
titer in cell culture. We chose to develop an infectious
clone for the 16681 strain of DEN-2 virus because the
candidate vaccine viruses developed by Mahidol University
are currently the best live, attenuated vaccine virus
candidates in terms of immunogenic efficacy and lack of
reactogenicity in vaccinees. We developed an infectious
cDNA clone of the 16681 strain, which is the parent to the
DEN-2 PDK-53 candidate vaccine virus developed at Mahidol
University, to permit engineering of second and later
generation live, attenuated DEN vaccine viruses.

The infectious clone strategy was initiated with the
virulent parental 16681 strain obtained from the Division
of Vector-Borne Infectious Diseases (DVBID) of the Centers
for Disease Control and Prevention (CDC) virus collection.
We synthesized cDNA from the DEN-2 16681 viral RNA. The
immediate objective was to obtain an accurate full
genome-length infectious cDNA clone of the 16681 strain of
DEN-2 virus, since it was essential to develop a reliable
experimental system to permit routine genetic engineering
of the cDNA and recovery of virus. Our approach involved
using polymerase chain reaction (PCR) technology to create
cDNA clones that could be spliced together to construct a



WO 96/40933 PCT/US 96/09209 

24 

single full genome-length clone (Figure 1) from which
full-length, infectious DEN-2 genomic mRNA could be
transcribed (Figure 2).

The first full-length sequence-characterized cDNA
clone, designated pD2/IC-20, was constructed in the high
copy number pUC19 plasmid vector. Successful
transcription of genome-length DEN-2 16681 viral RNA from
pD2/IC-20 was clearly demonstrated by agarose gel
electrophoresis of the transcription reaction product.
However, RNA transcribed from this particular clone failed
to yield infectious virus. It was determined that cDNA
errors had occurred during the clone manipulations. We
then decided to reconstruct the full-length clone in the
low copy number pBR322 plasmid. The full-length cDNA of
DEN-2 16681 virus was successfully moved into pBR322 to
construct pD2/IC-30P. Full-length, infectious DEN-2 16681
genomic RNA was subsequently transcribed from pD2/IC-30P.

The DEN-1 PDK-13, DEN-2 PDK-53, DEN-3 PGMK-30/FRhL-3,
and DEN-4 PDK-48 vaccine viruses were obtained from
Mahidol University. Our goal involved replacement of the
entire genomic cDNA backbone of the DEN-2 16681
full-length clone with the cognate cDNA cloned from the
genome of the DEN-2 PDK-53 candidate vaccine virus. The
prM and E genes of the DEN-2 PDK-53 virus are then replaced
with the prM and E genes of the DEN-1 PDK-13, DEN-3
PGMK-30/FRhL-3, and DEN-4 PDK-48 candidate vaccine viruses
to construct chimeric DEN-2/1, DEN-2/3, and DEN-2/4
viruses containing the nonstructural genes of the DEN-2
PDK-53 virus and the prM and E genes of the heterologous
DEN viruses.

It is contemplated that chimeric, infectious
clone-derived DEN-2/1, DEN-2/3, and DEN-2/4 viruses will
result in immediate improvement in the efficacy of a
quadravalent vaccine. Our preliminary data from Mahidol
University indicate that very small amounts of the DEN-2
PDK-53 vaccine virus were required to infect and immunize
humans. However, the DEN-1, DEN-3, and DEN-4 vaccine virus
candidates had approximately 30-fold to 2000-fold lower
infectivity for humans. The low infective efficacies of
the DEN-1, DEN-3, and DEN-4 viruses create significant
problems in terms of vaccine efficacy in eliciting
seroconversion in vaccinees, as well as problems of
vaccine production for mass vaccination programs, since a
large volume, up to 1 ml, of undiluted cell
culture-derived vaccine virus must be administered to
achieve even minimal levels of infectivity for these
viruses. Since the increased infectivity of the DEN-2
PDK-53 vaccine virus is likely due to more efficient virus
replication, and since this replicative efficacy is
controlled by the nonstructural proteins of the virus,
chimeric vaccine viruses that express the relevant
immunogenic structural proteins of DEN-1, DEN-3, or DEN-4
virus in the context of replication control by the
nonstructural gene products of the DEN-2 PDK-53 virus
should replicate better and be more infective and
immunogenic in human vaccinees than the original DEN-1,
DEN-3, and DEN-4 vaccine viruses containing nonchimeric
genotypes.

A quadravalent vaccine is obtained upon completion of
the following steps:

(1) A full genome-length infectious cDNA clone for a
    DEN-2 virus, strain 16681, is constructed.

(2) A full genome-length infectious cDNA clone for a
    DEN-2 16681 attenuated derivative, PDK-53, is
    constructed, preferably by substituting the
    genomic cDNA backbone of the DEN-2 16681
    full-length clone with the corresponding cDNA cloned
    from the genome of the DEN-2 PDK-53 candidate
    vaccine virus.

(3) The candidate DEN-1, DEN-3, and DEN-4 vaccine
    viruses are subjected to PCR amplification of
    cDNA from extracted genomic RNA, and chimeric
    infectious cDNA clones expressing the prM and E
    genes of DEN-1, DEN-3, and DEN-4 viruses,
    respectively, in the context of the
    nonstructural genes of the DEN-2 PDK-53 virus
    are constructed.

(4) The infectious clone-derived chimeric DEN-2/1,
    DEN-2/3, and DEN-2/4 vaccine viruses are tested
    to ensure that they:

    (a) Are viable;
    (b) Express appropriate virus-specific immunogens;
    (c) Replicate to sufficient titer in cell culture;
    (d) Are infectious and immunogenic for humans; and
    (e) Retain phenotypic markers of attenuation.

There is no good animal model for investigating
dengue pathogenesis. DEN viruses are naturally
transmitted between mosquitos and humans. Although lower
primates can be infected with these viruses, they do not
develop the clinical profiles that occur in humans.
Infectious clone-derived viruses can be compared to their
more virulent parental strains using certain in vitro and
in vivo markers:

In Vitro Markers:

    Plaque size in cell culture;
    Temperature sensitivity;
    Cytopathic effects (CPE) in LLC-MK2 cells; and
    Replication in macrophages.

In Vivo Markers:

    Virulence by intracranial route in mice;
    Viremia in monkeys;
    Virulence by intracranial route in monkeys; and
    Elicitation of neutralizing antibodies in animals.

Infectious cDNA clones are expressed, the resulting
RNA transcripts are transfected into permissive cells,
and the live, attenuated viruses are formulated into
vaccines.

Additionally, the DEN-2 PDK-53 and chimeric DEN-2/1,
DEN-2/3, and DEN-2/4 infectious cDNA clones can by
themselves confer immunity by DNA immunization, a form of
gene therapy involving the direct inoculation of naked DNA
into the host such that its expression produces an immune
response (e.g., Ulmer et al., 1993 (DNA immunization
protected against influenza); Cox et al., 1993 (DNA
immunization protected against herpesvirus); Xiang et al.,
1994 (DNA immunization protected against rabies); Sedegah
et al., 1994 (DNA immunization protected against
malaria)).

Moreover, infectious cDNA clones are exquisite tools
for studying the molecular biology of virus structure,
function, and replication. This has been amply
demonstrated for many RNA viruses in the literature,
including Venezuelan equine encephalitis virus as reported
by Kinney et al. (1989). A successful infectious cDNA
clone of DEN-2 virus permits important investigations of
dengue virus replication, pathogenesis, and antigenic
structure. Infectious clone cDNA templates permit the
directed engineering of virus vaccines. Directed
site-specific, nonrandom mutations can readily be made in
infectious cDNA clones, and therefore in clone-derived
viruses, using a wide variety of DNA modification enzymes,
restriction endonucleases, and in vitro mutagenesis
methods. DNA is easier to manipulate than RNA, and the
10⁻⁹ error rate of DNA replication is much lower than the
10- - 10- error rate produced by RNA polymerases. Infectious
cDNA clones permit direct analyses of the phenotypic
effects of individual and cumulative mutations in the
viral genome. An infectious cDNA clone provides a "gold
standard" reference sequence for a vaccine.



Particular aspects of the invention may be more
readily understood by reference to the following examples,
which are intended to exemplify the invention, without
limiting its scope to the particular exemplified
embodiments.

EXAMPLES

Information:

Most of the background, protocols, and recipes used
in recombinant DNA work can be found in Molecular Cloning:
A Laboratory Manual (Sambrook et al., 1989), and Current
Protocols in Molecular Biology (Ausubel et al., 1989).

Viruses:

The virulent parental DEN-2 16681 strain was
immediately available in the DVBID collection of viruses.
We received the DEN-1 PDK-13, DEN-2 PDK-53, DEN-3
PGMK-30/FRhL-3, and DEN-4 PDK-48 vaccine viruses from
Mahidol University. The DEN vaccine viruses were passaged
in primary dog kidney (PDK) cells because this cell culture
is included among those cell types that are certified for
human use by the Bureau of Biologics, US Food and Drug
Administration (Yoksan et al., 1986). The virus strain
designations are shown below:

Virus     Parent Strain     Vaccine Derivative

DEN-1     16007             PDK-13
DEN-2     16681             PDK-53
DEN-3     16562             PGMK-30/FRhL-3
DEN-4     1036              PDK-48

PDK = primary dog kidney cells
FRhL = fetal rhesus lung cells
PGMK = primary green monkey kidney cells

DEN-1 16007 Parent:

► Recovered from serum of a patient with hemorrhagic fever
  and shock in Thailand in 1964
► Passaged 3X in BS-C-1 cells, 1X in LLC-MK2 cells
► Passaged 2X in Toxorhynchites amboinensis mosquitos
► PDK-1
    ↓
  PDK-13 Vaccine

DEN-2 16681 Parent:

► Recovered from serum of a patient with hemorrhagic fever
  and shock in Thailand in 1964
► Passaged 3X in BS-C-1 cells, 1X in LLC-MK2 cells
► Passaged 2X in Toxorhynchites amboinensis mosquitos
► PDK-1
    ↓
  PDK-53 Vaccine

DEN-3 16562 Parent:

► Recovered from serum of a patient with hemorrhagic fever
  and shock in the Philippines in 1964
► Passaged 3X in BS-C-1 cells, 1X in LLC-MK2 cells
► Passaged 2X in Toxorhynchites amboinensis mosquitos
► PGMK-1
    ↓
  PGMK-30          DEN-3 virus grown in PGMK cells
    ↓              replicated to very low titer in
  FRhL-3 Vaccine   PDK cells (Yoksan et al., 1986)

DEN-4 1036 Parent:

► Recovered from serum of a patient with dengue fever in
  Indonesia in 1976
► Passed 4X in Aedes aegypti mosquitos
► PDK-1
    ↓
  PDK-48

The DEN-2 full-length cDNA clone was derived from the
DVBID seed of DEN-2 16681 virus, which had the passage
history:

    Human
    3X BS-C-1 cells
    2X LLC-MK2 cells
    2X T. amboinensis mosquitos
    4X C6/36 cells (Aedes albopictus)

Complementary DNA (cDNA) was amplified by RT/PCR
directly, without further cell culture passage, from virus
present in vaccine vials of the DEN-1 PDK-13, DEN-2
PDK-53, DEN-3 PGMK-30/FRhL-3, and DEN-4 PDK-48 viruses.

Stock virus seed was prepared from virus-infected
cells grown in 75 or 150 cm² plastic tissue culture flasks.
The culture medium was clarified by centrifugation for 30
min at 10,000 rpm in a Sorvall GSA rotor, the final
concentration of fetal bovine serum (FBS) was brought to
10% (v/v), and the clarified virus suspension was frozen in
aliquots of 0.5 - 1.0 ml at -70°C. Gradient-purified
DEN-2 16681 virus was prepared according to the method of
Obijeski et al. (1976) as reported by Kinney et al.
(1983).

Cell Lines:

Infectious virus was derived from the infectious cDNA
clones by electroporation of BHK-21-15 (baby hamster
kidney-21, clone 15) cells with transcribed viral RNA.
Viruses were also grown in LLC-MK2 monkey kidney cells,
Vero African green monkey kidney cells, and C6/36 mosquito
cells (Aedes albopictus C6 cells, clone 36, Igarashi
(1978)). All four cell lines were grown in Eagle's
minimal essential medium (MEM) supplemented with 10% (v/v)
heat-inactivated (56°C for 30 min) FBS, 1.25 g/L of sodium
bicarbonate, 100 units/ml of penicillin G, and 100 µg/ml
of streptomycin sulfate. Confluent cell monolayers grown
in plastic tissue culture flasks were infected by
decanting the growth medium, permitting the virus inoculum
to adsorb for 1.5 h at 37°C, and then adding MEM
containing 5% FBS. For plaque titration of viruses,
confluent cell monolayers in plastic 6-well trays were
inoculated with 200 µl of the appropriate dilution of
virus. Virus was adsorbed to the cell monolayer for 1.5 h
at 37°C. The cells were then overlaid with 3 ml of 1%
(w/v) Noble agar (maintained at 40°C) in MEM lacking
phenol red pH indicator and containing 2% FBS and 0.01%
(w/v) DEAE-dextran. Following incubation for 6 days at
37°C in a 5% CO2 atmosphere, a second 1-ml agar overlay
containing 50 µg/ml of neutral red vital stain was added.
Viral plaques were counted 2-5 days later.
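
The plaque counts from this assay are conventionally converted to a titer by dividing by the dilution factor and the inoculum volume. The sketch below illustrates that arithmetic using the 200-µl (0.2-ml) inoculum stated above; the function name and the example plaque count and dilution are hypothetical and are not data from this work.

```python
def titer_pfu_per_ml(plaque_count, dilution, inoculum_ml=0.2):
    """Return PFU/ml = plaques / (dilution factor x inoculum volume in ml).

    dilution is the fraction of undiluted seed in the inoculum
    (for example, 1e-4 for a 10^-4 dilution).
    """
    if plaque_count < 0:
        raise ValueError("plaque_count cannot be negative")
    if dilution <= 0 or inoculum_ml <= 0:
        raise ValueError("dilution and inoculum_ml must be positive")
    return plaque_count / (dilution * inoculum_ml)


# Hypothetical well: 40 plaques counted at a 10^-4 dilution.
example_titer = titer_pfu_per_ml(40, 1e-4)
print(f"{example_titer:.2e} PFU/ml")
```

In practice, wells with too few or too many plaques to count reliably would be excluded before applying the formula; that selection step is left out of this sketch.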

The E. coli K-12 strains used in this project
included XL1-Blue, MC-1061, SURE, JM101, and TB-1.
Recombinant plasmid containing full genome-length cDNA of
DEN-2 virus was successfully replicated in E. coli
XL1-Blue, MC-1061, and TB-1. Flavivirus cDNA, particularly
the gene region encoding the envelope glycoprotein, is
troublesome in E. coli. Bacteria hosting the recombinant
plasmid containing the full-length cDNA clone grew slowly
and were often difficult to streak for isolation on agar
plates containing selective antibiotic. Transformation
efficiencies were sometimes improved somewhat by
incubation of agar plates at 30°C or ambient temperature
rather than at 37°C. Bacterial stocks were stored frozen
at -70°C in 10% (v/v) glycerol.

Precautions for Working with RNA:

RNA is a fragile molecule that is very readily
degraded by the many ubiquitous RNases present in the
environment. Many of these RNases are resistant to
treatment with detergents and heat, including autoclaving.
All reagents and materials that contacted the viral RNA in
this project were RNase-free to avoid degradation of the
viral RNA by these ubiquitous, very stable enzymes. The
investigator wore tight-fitting gloves, maintained all
reagents on ice, used a plastic tool to open microtubes,
used individually packaged pipets, preferably plastic for
aqueous solutions, disposable plasticware which is
generally RNase-free before opening, and used "For RNA
Only" microtubes, Gilson micropipets (P-10, P-20, P-100,
P-200, P-1000), and tips with aerosol barriers. Use of
recycled glassware was avoided. Weigh boats, magnetic
stirrers, and pH meters were not used. Chemicals were
weighed in sterile, RNase-free disposable plastic 50-ml
centrifuge tubes, and solutions were adjusted to the
appropriate pH by aliquoting a small volume of the
solution onto pH paper. Whenever possible, commercially
prepared, guaranteed RNase-free reagents were purchased.
Otherwise, newly-opened chemicals were reserved "For RNA
Only." Water and stock salt solutions, except those
containing Tris, were treated overnight with 0.1% (v/v)
diethylpyrocarbonate (DEPC) to inactivate RNases via
alkylation and then autoclaved for 20 min. It is
advisable to use the best sterile technique when working
with RNA.



Extraction of Viral Genomic RNA:

Virus seeds containing at least 10 s PFU (plaque
forming units)/ml of virus are ideal for providing
appropriate yields of RNA. Seed with virus titer of 10« or
lower can be problematic in terms of yielding sufficient
RNA. For these low-titer seeds it is best to pool the
yields of several extracted seed aliquots.

RNA extraction involved the addition of 200 µl of
cold RNA lysis buffer (4 M guanidine isothiocyanate, 25 mM
sodium citrate, pH 7.0, 0.5% (w/v) sarkosyl, and 100 mM
beta-mercaptoethanol), and 30 µl of 3 M sodium acetate, pH
5.2, to an empty RNase-free 1.5-ml microtube on ice. In a
biosafety cabinet, 200 µl of DEN virus seed was added to
the microtube and mixed vigorously for 30 sec with a
mechanical mixer. The tube was centrifuged briefly to
collect the liquid; then 400 µl of cold phenol
(commercially supplied by AMRESCO) equilibrated to pH 4.5
and 80 µl of cold chloroform were added. The tube was
mixed vigorously for 30 sec, placed on ice for 15 min,
mixed again, then centrifuged for 1 min at maximum speed
in a refrigerated microcentrifuge to separate the aqueous
and organic phases. The top aqueous phase containing the
extracted RNA was transferred to a fresh 1.5-ml microtube
on ice, 400 µl of cold isopropanol was added, and the tube
was incubated for at least 1 h or overnight at -20°C. The
RNA was precipitated by centrifugation for 10 min at
maximum speed at 4°C. The supernatant was removed with a
pipet rather than by decantation, and the pellet was rinsed
with 500 µl of 75% (v/v) ethanol. After spinning again for
10 min, the ethanol was removed with a pipet. The tube was
centrifuged again briefly and the residual liquid was
removed with a micropipet. The RNA pellet was air dried
briefly, resuspended in 50 µl of cold RNase-free dH2O, and
stored frozen. For seeds containing low virus titer, the
RNA pellets in 3-6 microtubes were pooled in a total
volume of 50 µl.

RT/PCR Synthesis of Dengue Virus-Specific cDNA Fragments:

Full-length genomic mRNA was extracted directly from
200 µl of DEN virus seed. The standard reverse
transcriptase/polymerase chain reaction (RT/PCR) was
performed in a 100-µl reaction solution containing 5-18 µl
of the extracted viral RNA, 1 µl each of 100 µM stock
solutions (stored frozen in dH2O) of the upstream
mRNA-sense primer-amplimer and downstream
complementary-sense primer-amplimer, 10 µl of 10X standard
PCR buffer (500 mM KCl, 100 mM Tris-HCl, pH 8.5, 15 mM
MgCl2, and 0.1% (w/v) gelatin), 8.0 µl of 2.5 mM dNTPs
(2.5 mM each of dATP, dCTP, dGTP, and dTTP;
Pharmacia-LKB), 0.5 µl of 1 M dithiothreitol (DTT), 0.5 µl
of RNase inhibitor (RNasin, 40 U/µl, Boehringer-Mannheim),
0.5 µl of Taq DNA polymerase (5 U/µl, Perkin-Elmer), and
0.5 µl of RAV-2
reverse transcriptase (18 U/µl, Takara). The reaction
solution was made as two components:

► PCR Reaction Mix:

   10.0 µl   10X Standard PCR Buffer
    8.0 µl   2.5 mM dNTPs
    0.5 µl   1 M DTT
    0.5 µl   RNasin (40 U/µl)
    0.5 µl   Taq DNA Polymerase (5 U/µl)
    0.5 µl   RAV-2 RT (18 U/µl)
   60.0 µl   RNase-Free dH2O
   -------
   80.0 µl   Reaction Mix for 1 reaction. Make more
             than needed for all reaction tubes. Store
             excess at -70°C for reuse.

► Template/Primer Mix:

   18.0 µl   DEN-2 RNA Template
    1.0 µl   100 µM Up-Amplimer
    1.0 µl   100 µM Down-Amplimer
   -------
   20.0 µl

► Reaction Solution:

   80.0 µl   PCR Reaction Mix
   20.0 µl   Template/Primer Mix
   -------
  100.0 µl   In a thin-walled, 200-µl microtube.
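
Because the PCR Reaction Mix is prepared in bulk for several tubes, the per-reaction volumes can be scaled programmatically. The following is a minimal sketch of that bookkeeping; the per-reaction volumes are those listed for the PCR Reaction Mix, whereas the 10% overage and the function name are assumptions chosen only for illustration (the text says only to make more than needed).

```python
# Per-reaction volumes (µl) of the PCR Reaction Mix components.
PCR_REACTION_MIX_UL = {
    "10X standard PCR buffer": 10.0,
    "2.5 mM dNTPs": 8.0,
    "1 M DTT": 0.5,
    "RNasin (40 U/µl)": 0.5,
    "Taq DNA polymerase (5 U/µl)": 0.5,
    "RAV-2 RT (18 U/µl)": 0.5,
    "RNase-free dH2O": 60.0,
}


def scale_master_mix(n_reactions, overage=0.10):
    """Return component volumes (µl) for n reactions plus a fractional overage.

    The 10% default overage is an assumed value, not one given in the text.
    """
    if n_reactions < 1:
        raise ValueError("n_reactions must be at least 1")
    if overage < 0:
        raise ValueError("overage cannot be negative")
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2)
            for name, vol in PCR_REACTION_MIX_UL.items()}


per_reaction_total = sum(PCR_REACTION_MIX_UL.values())  # mix volume per tube
mix_for_8 = scale_master_mix(8)

for name, vol in mix_for_8.items():
    print(f"{vol:8.2f} µl  {name}")
```

Each tube would then receive the per-reaction volume of mix plus 20 µl of Template/Primer Mix, giving the 100-µl reaction solution.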



The RT/PCR reactions in thin-wall 200-µl microtubes
(Phenix Research Products) were incubated without oil
overlay in a Perkin-Elmer Model 9600 thermocycler
according to the following program:

   50°C for 60 min         First strand cDNA synthesis
                           by reverse transcriptase

   94°C for 4 min
   50°C for 1 min
   72°C for 5 min

   94°C for 30 sec
   55°C for 30 sec         30 Cycles
   72°C for 5 min
        Delta +10 sec/cycle

Following completion of the RT/PCR reactions, 5-µl
aliquots of each of the 100-µl reactions were analyzed by
agarose gel electrophoresis. The DNA bands in the agarose
gel were stained in ethidium bromide (500 ng/ml) solution
and visualized on an ultraviolet light box. Since
extraneous non-target cDNA bands are often amplified in
addition to the target cDNA molecules, the remaining 95 µl
of each RT/PCR reaction was electrophoresed in a larger,
preparative agarose gel, and the target cDNA was stained
briefly, excised with a razor blade, and physically
extracted from the agarose slice.
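
The total programmed incubation time of the RT/PCR thermocycler program given above can be estimated with a few lines of arithmetic. This is a sketch under stated assumptions: the 10-sec extension increment is taken to start with the second of the 30 cycles, the three steps after reverse transcription are run once, and ramp times between temperatures are ignored. The function names are illustrative.

```python
def extension_seconds(cycle_index, base=300, delta=10):
    """Extension time (sec) for a 0-based cycle index.

    Assumes the first cycle uses the base 5-min extension and each
    later cycle adds `delta` seconds to the previous one.
    """
    return base + delta * cycle_index


def program_minutes(n_cycles=30):
    """Total programmed hold time in minutes, ignoring ramp times."""
    rt_step = 60 * 60                         # 50°C for 60 min
    first_round = 4 * 60 + 1 * 60 + 5 * 60    # 94°C 4 min, 50°C 1 min, 72°C 5 min
    cycling = sum(30 + 30 + extension_seconds(i)  # 94°C 30 s, 55°C 30 s, 72°C
                  for i in range(n_cycles))
    return (rt_step + first_round + cycling) / 60


print(f"last-cycle extension: {extension_seconds(29)} sec")
print(f"total programmed time: {program_minutes():.1f} min")
```

Under these assumptions the program occupies the instrument for a little over five hours, most of it in the lengthening extension steps.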

High-Melt-Crush (HMC) Extraction of DNA from Agarose:

An agarose gel slice containing DNA was placed in a
1.5-ml microtube and crushed thoroughly with a spatula or
pestle. The volume of the crushed agarose was brought to
400-500 µl with TE buffer (10 mM Tris-HCl, pH 7.5,
1 mM disodium EDTA) and 400 µl of phenol (supplied by
AMRESCO), pH 8, was added. The agarose suspension was
mixed vigorously using a mechanical mixer, frozen, thawed
and mixed, frozen, thawed and mixed, and then centrifuged
for 10 min at maximum speed at 4°C. The top aqueous phase
was transferred to a fresh microtube, extracted with 400
µl of phenol:chloroform:isoamyl alcohol (25:24:1) and
centrifuged for 2 min. The top aqueous phase was
transferred to a fresh tube and extracted with 700 µl of
diethyl ether or chloroform. If chloroform was used, the
top phase was again transferred to a fresh tube after a
brief spin to separate phases. The DNA was precipitated
for at least 30 min at -70°C or overnight at -20°C
following addition of 2.5 volumes (essentially filling the
microtube) of 95% ethanol containing 300 mM ammonium
acetate and 10 mM MgCl2. The DNA was pelleted at 4°C by
centrifugation for 20 min at maximum speed. The liquid
was decanted, and the DNA pellet was rinsed with 500 µl of
75% ethanol, air-dried briefly, dissolved in 30 µl of TE
buffer, and stored frozen or in the refrigerator. A 3-µl
aliquot of the extracted DNA was analyzed for purity and
quantity by agarose gel electrophoresis. Generally,
20-80% of the DNA loaded onto a gel can be recovered from
the gel by this method.

Agarose Gels:

DNA was analyzed by electrophoresis in 1% (w/v)
agarose gels run in TBE buffer (100 mM Tris-HCl, pH 8, 91
mM boric acid, and 20 mM disodium EDTA). DNA bands were
visualized by staining the gel in water containing 500
ng/ml of ethidium bromide and exposure to ultraviolet
light. Gels used for analyzing RNA transcripts were made
with RNase-free reagents. Ethidium bromide stain was
incorporated in the gel and running buffer so that the RNA
bands could be visualized immediately. To obtain
gel-purified DNA fragments, DNA was electrophoresed in 0.7%
(w/v) agarose gels made with genetic technology grade
Seakem agarose (FMC) or with biotechnology grade agarose
(3:1 high resolution blend, AMRESCO).

Cloning of Dengue Virus-Specific cDNA Fragments:

Some DNA polymerases add an extra "A" nucleotide
overhang at the 3'-end of synthesized DNA strands. The
Taq DNA polymerase does this. To enable the cloning of
DNA molecules synthesized using Taq DNA polymerase,
TA-cloning vectors have been engineered (Marchuk et al.,
1991). These vectors generally have a single "T" overhang
engineered at the 3'-terminus of EcoRV-cut, blunt-ended,
linearized plasmid vector. The EcoRV site occurs within
the multiple cloning site (MCS) of the plasmid. The MCS
is a series of contiguous, unique restriction enzyme
(RENZ) sites engineered into a vector plasmid to permit
subcloning of exogenous DNA fragments following
restriction with a variety of RENZs. The HMC-purified DEN
cDNA amplicons were cloned into the 3900-bp pCRII
(Invitrogen), the 2887-bp pT7Blue(R) (pT7Blue, Novagen),
or the 3003-bp pGEM-5Zf (Promega) TA-vector plasmid. The
RENZ sites available in the MCS region of these
TA-vectors, as well as the RENZ sites of the MCS of the
general purpose cloning plasmids, pUC18 and pUC19, used in
this project are shown below.

RENZ Sites Present in the MCS of Several Cloning Vectors

pUC18     pUC19     pT7Blue    pCRII     pGEM-5Zf

                    T7         SP6       T7
EcoRI     HindIII   HindIII    NsiI      ApaI
SstI      SphI      BspMI      HindIII   AatII
KpnI      PstI      SphI       KpnI      SphI
SmaI      SalI      PstI       SstI      NcoI
BamHI     XbaI      Sse8387I   BamHI     SstII
XbaI      BamHI     SalI       SpeI      EcoRV
SalI      SmaI      AccI       BstXI     SpeI
PstI      KpnI      HincII     EcoRI     NotI
SphI      SstI      XbaI       EcoRV     PstI
HindIII   EcoRI     SpeI       EcoRI     SalI
                    NdeI       PstI      NdeI
                    EcoRV      BstXI     SacI
                    BamHI      NotI      BstXI
                    AvaI       AvaI      NsiI
                    SmaI       SphI      SP6
                    KpnI       NsiI
                    SacI       XbaI
                    BanII      ApaI
                    EcoRI      T7



The pUC18/19 plasmids possess identical MCS sites in
reverse orientation in the plasmid backbone. Their
purpose is to permit cloning of DNA in either orientation
into the plasmid using the same pair of RENZs; this
reversibility was exploited in this project. The
TA-vectors used here all possessed T7 and/or SP6
bacteriophage RNA promoters to enable RNA transcription
from cloned DNA. These promoters were not used in this
project. All of the plasmids contain the gene for
ampicillin resistance. They also contained the lac Z
portion of the E. coli lac operon. This permits color
discrimination between bacterial colonies that receive a
recombinant or a wild-type plasmid. In the presence of
IPTG and X-gal, bacterial colonies that are transformed
with a wild-type plasmid lacking a cDNA insert develop a
blue color, whereas cells that receive a recombinant
plasmid with cDNA cloned into the MCS of the plasmid are
white. Agar plates contained 800 µg of IPTG and 800 µg of
X-gal.

Fifty to 100 ng of HMC-purified amplicon was ligated
to 50 ng of the pCRII vector using the TA-vector cloning
kit supplied by Invitrogen exactly as specified by the
instructions supplied with the kit. Frozen,
transformation-competent E. coli INVαF' cells, supplied
with the Invitrogen kit and stored at -70°C, were
transformed with the ligated DNA as described in the kit
instructions. The transformed cells were plated on YTA50
agar plates (8 g of DIFCO tryptone, 5 g of DIFCO yeast
extract, 5 g of NaCl, and 15 g of BACTO agar per liter of
dH2O) containing 50 µg/ml of ampicillin. Only bacterial
cells transformed with the pCRII plasmid, which contains
an ampicillin resistance gene, grow on this medium. The
agar plates were incubated at 37°C overnight.

Similarly, cDNA was ligated to the other TA-vectors
or to pUC18/19 cut with the appropriate RENZ(s).
Ligations were performed at room temperature or at 12°C.
E. coli XL1-Blue, SURE, TB-1, or MC-1061 cells were
transformed by electroporation and plated on YTA50 plates.
Electroporation was performed according to Dower et al.
(1988) using cuvettes with a 2-cm electrode gap in a
Bio-Rad Gene Pulser set at 2.5 kV voltage, 25 µF
capacitance, and 200 ohms resistance.
Electroporation-competent cells were prepared by growing a
fresh bacterial culture to an optical density of 0.5-0.7
at 600 nm. The cells from 1.5-3 L of culture were pelleted
by centrifugation for 10 min at 4°C and 5000 rpm in a
Sorvall GSA rotor, pooled, washed twice in 1 mM Hepes
buffer, and resuspended in 2 ml of 10% (v/v) sterile
glycerol per L of original culture. The concentrated cells
in glycerol were stored at -70°C.
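The pulse settings quoted above imply a nominal pulse length. The following is a minimal sketch, assuming the pulse decays as a simple RC circuit in which the 200-ohm setting dominates the resistance; the function name is illustrative and is not taken from the instrument documentation.

```python
def nominal_time_constant_ms(resistance_ohm, capacitance_uF):
    """Nominal RC time constant in milliseconds.

    Assumes an ideal exponential decay where the parallel resistor
    setting dominates; the time constant actually displayed by an
    instrument also depends on the sample and is usually a little lower.
    """
    capacitance_F = capacitance_uF * 1e-6
    return resistance_ohm * capacitance_F * 1e3


# Settings stated in the text: 200 ohms and 25 uF.
tau_ms = nominal_time_constant_ms(200, 25)
print(f"nominal time constant: {tau_ms:.1f} ms")
```

Under these assumptions the stated settings correspond to a pulse of about 5 ms.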

Bacterial colonies were transferred to 2 ml of
2XYT-Amp50 broth (16 g of tryptone, 20 g of yeast extract,
and 5 g of NaCl per liter of dH2O) and incubated overnight
with shaking at 300 rpm at 37°C in a floor model
incubator-shaker (model Innova 4300, New Brunswick).
Recombinant plasmid was extracted from these 2-ml
minicultures and analyzed by agarose gel electrophoresis
for the presence of cDNA insert. Recombinant plasmids are
larger than wild-type vector plasmid because of the cDNA
insert, and they migrate more slowly than wild-type
plasmid in agarose gels.

All of the DEN-2 16681 virus-specific cDNA amplicons
were cloned into the pCRII TA-vector. Aliquots of
insert-positive miniprep plasmids were digested with the
restriction enzyme EcoRI. Since the pCRII MCS contains
two EcoRI recognition sites (palindromic hexameric
sequence GAATTC), one on either side of the EcoRV cDNA
cloning site, this RENZ cleaved the cDNA insert from the
plasmid vector and cleaved any EcoRI sites that were
present within the cDNA itself. The EcoRI-restricted DNA
was analyzed by agarose gel electrophoresis to determine
that the cloned cDNA was of appropriate size. In our
experience, cloning of PCR-derived cDNA amplicons 2000 bp
or smaller in size into the TA-vector is efficient.
Cloning amplicons larger than 3500 bp into the TA-vector
can be very difficult.
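The EcoRI screen described above can be illustrated by predicting the fragment sizes of a digest. The following is a minimal sketch, assuming a linear toy sequence (a hypothetical placeholder, not DEN-2 or pCRII sequence) in which two flanking GAATTC sites bracket an insert that carries one internal site. EcoRI cuts after the G of GAATTC on the top strand.

```python
ECORI_SITE = "GAATTC"
TOP_STRAND_CUT = 1  # EcoRI cleaves G^AATTC


def ecori_fragment_lengths(seq):
    """Top-strand fragment lengths after complete EcoRI digestion
    of a linear DNA sequence."""
    seq = seq.upper()
    cuts = []
    pos = seq.find(ECORI_SITE)
    while pos != -1:
        cuts.append(pos + TOP_STRAND_CUT)
        pos = seq.find(ECORI_SITE, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical toy construct: vector arm, flanking site, insert with
# one internal EcoRI site, flanking site, vector arm.
toy = "AAAA" + ECORI_SITE + "C" * 20 + ECORI_SITE + "T" * 10 + ECORI_SITE + "GGG"
print(ecori_fragment_lengths(toy))
```

An insert with an internal EcoRI site is released as two or more fragments, so the insert size is the sum of the internal fragments rather than the size of a single band.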

After screening, certain of the miniprep plasmids
were selected for further analysis. Their corresponding
bacterial minicultures were streaked for isolation on
YTA50 plates, and an isolated colony was inoculated into
50-200 ml of YTA50 broth to grow up a preparative amount of
recombinant plasmid. The preparative scale for the
extraction of the plasmid was essentially identical to
that for minipreps except for scaled-up volumes.






White colonies containing recombinant plasmid were
picked with a sterile toothpick and shaken overnight at
300 rpm in 2 ml of 2X-YTA50 broth. Each miniculture was
decanted into a 1.5-ml microtube, and the cells were
pelleted by centrifugation at 6000 rpm for 2 min. The
supernatant was aspirated, and the cell pellet was
resuspended gently by up/down micropipetting in 200 µl of
GTE buffer (50 mM glucose, 25 mM Tris-HCl, pH 8.0, and 25
mM disodium EDTA) and then mixed with 300 µl of lysis
buffer (0.2 N NaOH, 1% (w/v) sodium dodecylsulfate (SDS)).
After incubation on ice for 5 min, 300 µl of cold
potassium acetate solution (3 M potassium acetate, 7 M
acetic acid, pH 4.8) was added, and the solution was
chilled for 5 min on ice and then centrifuged at maximum
speed for 10 min at 4°C. The supernatant was poured into
a fresh microtube, RNase A was added to 20 µg/ml, and the
mixture was incubated at 37°C for 30 min. The sample was
extracted twice with 600 µl of chloroform and centrifuged
for 1 min at maximum speed at room temperature. The DNA
pellet was dissolved in 32 µl of dH2O. Eight µl of 4 M NaCl
and 40 µl of 13% (w/v) PEG-8000 were added, and the mixed
solution was incubated for 5 min on ice. The sample was
centrifuged for 15 min at maximum speed at 4°C, the liquid
was aspirated with a micropipet, and the pellet was
rinsed with 500 µl of 75% ethanol. The air-dried pellet
was dissolved in 30 µl of dH2O and stored frozen until
used.

Extraction of Plasmid DNA from Large Cultures of E. coli:

Preparative-scale plasmid extraction was performed by
inoculating 100 ml of 2X-YTA50 broth with 2 ml of an
overnight culture of E. coli. The culture was shaken
overnight at 300 rpm and 37°C. The cells were pelleted by
centrifugation for 10 min at 5000 rpm in a Sorvall GSA
rotor and resuspended in 6 ml of cold GTE buffer. Nine ml
of a freshly made solution of 0.2 N NaOH and 1% (w/v) SDS
was added. The sample was incubated for 5 min on ice,
then 9 ml of cold 3 M potassium acetate solution was
added. After another 5-min incubation on ice, the tube
was centrifuged for 20 min at 10,000 rpm at room
temperature and the supernatant was transferred to a fresh
30-ml glass tube. RNase A was added to 20 µg/ml, and the
sample was incubated for 30 min at 37°C and then extracted
twice with 6 ml of chloroform. Twelve ml of
room-temperature isopropanol was added and the tube was
centrifuged immediately for 20 min at 10,000 rpm at room
temperature. The supernatant was decanted, and the DNA
pellet was rinsed with 1 ml of 75% ethanol, air dried
briefly, and resuspended in 480 µl of dH2O. The DNA was
precipitated by addition of 120 µl of 4 M NaCl and 600 µl
of 13% PEG-8000, incubation for 5 min on ice, and
centrifugation for 15 min at maximum speed at 4°C. The
DNA pellet was rinsed with 500 µl of 75% ethanol, air
dried briefly, rehydrated in TE buffer, and stored frozen.
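The NaCl/PEG precipitation step is stated with different volumes for the miniprep and the preparative procedure. The following is a minimal sketch of the dilution arithmetic, assuming that volumes are additive and using the stock concentrations given in the text (4 M NaCl and 13% PEG-8000); the function name is illustrative.

```python
def peg_precipitation_mix(dna_ul, nacl_ul, peg_ul,
                          nacl_stock_M=4.0, peg_stock_pct=13.0):
    """Return (total volume in ul, final NaCl in M, final PEG in %).

    Assumes simple additive volumes.
    """
    total = dna_ul + nacl_ul + peg_ul
    final_nacl = nacl_ul * nacl_stock_M / total
    final_peg = peg_ul * peg_stock_pct / total
    return total, final_nacl, final_peg


# Volumes stated in the text for the two scales.
for label, volumes in (("miniprep", (32, 8, 40)),
                       ("preparative", (480, 120, 600))):
    total, nacl, peg = peg_precipitation_mix(*volumes)
    print(f"{label}: {total} ul, {nacl:.2f} M NaCl, {peg:.1f}% PEG-8000")
```

Both scales work out to the same final conditions (0.4 M NaCl and 6.5% PEG-8000), which is consistent with the statement that the preparative procedure is a scaled-up miniprep.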

Nucleotide Sequence Analysis of the Dengue cDNA Clones:

Nucleotide sequence analyses of DEN-2 16681 cDNA
clones #1-#15 were performed by cloning EcoRI restriction
fragments of each clone into the single-stranded
bacteriophage M13mp18 or M13mp19. Since this is not the
current method of choice for sequencing, the method will
be described only briefly here. The procedure used for
the extraction of plasmid DNA from bacterial cells was
also used to extract the intracellular double-stranded
replicative form (RF) DNA of M13 from
bacteriophage-infected E. coli JM101 cells. The RF DNA was
linearized at the EcoRI site of the MCS and ligated to the
DEN-2 HMC-purified EcoRI cDNA restriction fragments.
Electroporation-competent E. coli JM101 cells were
transformed by electroporation and plated onto H-agar
plates (10 g of DIFCO tryptone, 5 g of NaCl, 15 g of BACTO
agar, and 1% (w/v) thiamine per liter of dH2O) containing
800 µg each of isopropyl-β-D-thiogalactopyranoside (IPTG)
and 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (BCIG
or X-gal). The electroporated cells were mixed with 300 µl
of a fresh logarithmic culture of JM101 cells and 3 ml of
warm (51°C) top H-agar containing 9 g/L of agar and then
poured onto the H-agar plates. Cells that were
transfected with recombinant DNA supported replication of
recombinant M13 virus, resulting in the formation of
bacteriophage plaques in the JM101 cell lawn on the agar
plate. The IPTG/BCIG histochemistry of the system
permitted identification of white plaques containing
recombinant bacteriophage in which cDNA had been ligated
into the EcoRI site of the MCS, whereas wild-type
nonrecombinant M13 bacteriophage produced blue plaques.
Isolated plaques were picked, inoculated into 3 ml of a
fresh, pre-logarithmic phase culture of JM101, and shaken
at 37°C for 8-16 h. The minicultures were clarified by
centrifugation in 1.5-ml microtubes, the bacteriophage
particles were precipitated with PEG-8000, and the
single-stranded, circular bacteriophage DNA was isolated
from the virions by phenol extraction. The recombinant,
circular, single-stranded bacteriophage DNA was sequenced
by the dideoxynucleotide termination method. Sequencing
kits can be purchased from various commercial vendors.
Radioactive 32P-dCTP or 35S-dCTP was incorporated into the
strands synthesized in the sequencing reactions.
Sequencing was accomplished with many DEN-2 virus-specific
primers designed to sequence the entire genome. The
sequence reactions were electrophoresed in 6% (w/v)
polyacrylamide gels, which were dried onto filter paper
and overlaid with X-ray film. The DNA bands of the
autoradiographs were read by the investigator, and the
data was entered into a sequence project data spreadsheet.
This sequencing method has been used extensively in the
past (e.g., Kinney et al., 1986; Johnson et al., 1986;
Deubel et al., 1986; Deubel et al., 1988; Kinney et al.,
1989; Trent et al., 1987).

Nucleotide sequencing was also performed by the
current method of direct sequencing of double-stranded
plasmid DNA by the dideoxynucleotide termination method
using the Applied Biosystems Taq DyeDeoxy Terminator Cycle
Sequencing Kit, cycle sequencing in the Model 9600
thermocycler according to the instruction manual supplied
with the kit, and analyzing the DNA sequence on an ABI
Model 373A DNA sequencing apparatus. Sequencing reactions
in 200-µl thin-walled microtubes contained 9.5 µl of
reaction mix (buffer, the four dideoxynucleotides, and Taq
polymerase supplied in the kit), 7.0 µl of double- or
single-stranded template DNA (150 pg/bp), and 3.2 µl of
10 µM sequencing primer (32 pmol). After mixing, the
reactions were placed in a Perkin-Elmer Model 9600
thermocycler, and programmed cycle sequencing was
performed for 25 cycles of incubation at 96°C for 15 sec,
50°C for 15 sec, and 60°C for 4 min. Strand extension was
performed at 60°C rather than 72°C because the fluorescent
dye-labeled dideoxynucleotide terminators are temperature
sensitive. The reaction was then applied to a Centri-Sep
gel column (Princeton Separations) to remove
unincorporated dye-labeled dideoxynucleotides according to
the instructions supplied with the columns. The eluted
DNA was vacuum dried for 1 h using a Savant Speed Vac
Concentrator and stored at -70°C. The DNA was hydrated
with 5 µl of deionized formamide and 1 µl of 50 mM
disodium EDTA, then heated in an aluminum block for 2 min
at 90°C. A 3-µl aliquot of the denatured DNA sample was
applied to one of 34 wells of a polyacrylamide-urea gel in
an Applied Biosystems 373A DNA sequencer. The color-coded
sequence chromatograph was read by visual inspection, and
the resulting nucleotide sequence was entered into a
computer-maintained sequence data spreadsheet. The
sequencing kit incorporates dideoxynucleotide terminators
that are each labeled with a unique fluorescent dye that
permits laser detection of all four terminators in a
single polyacrylamide gel lane in the Model 373 sequencer.
The data was recorded in the form of colored chromatograms
that are easily read by the investigator. Single-stranded
recombinant M13 DNA can also be sequenced in this manner.
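The quantities given for the cycle sequencing reaction can be checked with simple arithmetic. The following is a minimal sketch using only the volumes, primer concentration, and cycling times stated above; the function names are illustrative, and the run time ignores temperature ramping between steps.

```python
def primer_pmol(volume_ul, concentration_uM):
    """Amount of primer in pmol (1 ul of a 1 uM solution holds 1 pmol)."""
    return volume_ul * concentration_uM


def cycling_seconds(n_cycles, step_seconds):
    """Total hold time of a cycling program, excluding ramp times."""
    return n_cycles * sum(step_seconds)


reaction_ul = 9.5 + 7.0 + 3.2          # reaction mix + template + primer
primer = primer_pmol(3.2, 10)          # 3.2 ul of 10 uM primer
hold_s = cycling_seconds(25, (15, 15, 240))  # 96, 50, and 60 degree steps

print(f"reaction volume: {reaction_ul:.1f} ul")
print(f"primer: {primer:.0f} pmol")
print(f"hold time: {hold_s / 60:.1f} min")
```

The primer amount agrees with the 32 pmol quoted in the text, and the 25-cycle program holds for a little under two hours before ramping is taken into account.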

Extraction of DNA from M13 Bacteriophage:

White bacteriophage plaques containing recombinant
M13 DNA were picked with sterile toothpicks and placed
into 2-ml slightly turbid (less than 0.15 A600) cultures of
E. coli JM101. The cultures were shaken at 300 rpm and
37°C overnight and then clarified by centrifugation in
microtubes at maximum speed for 10 min at room
temperature. One ml of the supernatant was transferred to
a fresh 1.5-ml microtube containing 200 µl of sterile 20%
(w/v) PEG-8000 in 250 mM NaCl. The tubes were mixed by
inversion, incubated for 15 min at room temperature, and
centrifuged at maximum speed for 5 min at room
temperature. The PEG supernatant was removed completely,
and the DNA pellet was resuspended in 300 µl of TE buffer.
An equal volume of pH 8-buffered phenol was added, and the
solution was mixed vigorously several times during a
period of 20 min at room temperature. The tube was
centrifuged for 5 min at room temperature, and the top
aqueous phase was transferred to a fresh 1.5-ml microtube.
After sequential extraction with phenol:chloroform:isoamyl
alcohol and chloroform, the DNA was precipitated by adding
2.5 volumes of 95% ethanol containing 300 mM ammonium
acetate and 10 mM MgCl2 and incubating at -20°C overnight.
The tube was centrifuged at maximum speed for 15 min at
4°C, and the supernatant was decanted. Following a rinse
with 500 µl of 75% ethanol, the DNA was air dried briefly,
resuspended in 60 µl of TE buffer, and stored at 4°C.

Primers:

Primer design was based on the sequence of DEN-2
virus, strain 16681, published by Blok et al. (1992), and
DEN-2 virus, Jamaican strain 1409, as reported by Deubel
et al. (1986) and Deubel et al. (1988).

Primers were synthesized by the Biotechnology Core
Facility at the CDC in Atlanta, Georgia. We received the
dried primers via mail and adjusted them to a
concentration of 100 µM in dH2O. The designations and
sequences of all of the primers/amplimers used in this
project are listed in Appendix A.

To amplify the 3' end of the DEN-2 virus genome, a
downstream amplimer was designed that was complementary to
the published sequence of the 3' terminus of the genome.
A unique XbaI restriction enzyme site was incorporated at
the 5' end of this amplimer to provide a unique site to
permit linearization of the recombinant plasmid containing
the full-length cDNA clone at the 3' terminus of the
cloned genomic cDNA. This linearization was necessary to
obtain appropriately terminated DEN virus-specific run-off
RNA transcripts from the cDNA clone in transcription
reactions with bacteriophage T7 RNA polymerase.
Linearization at this 3'-terminal XbaI site resulted in
the incorporation of a 5-nucleotide TCTAG extension to the
3' terminus of the genomic mRNA transcribed from the
full-length cDNA clone of DEN-2 16681 virus, and a
4-nucleotide CTAG extension to the 3' terminus of RNA
transcribed from the DEN-2 PDK-53 cDNA clone. The
difference between the two cDNA clones in the length of
the extraneous 3'-terminal extension was due to the
differently designed 3'-terminal amplimers used to obtain
the 3'-end genomic cDNA amplicon. Amplimer cD2-10687.XBA
or cD2-10687.X2 was used to amplify and clone the
3'-terminal portion of DEN-2 16681 or PDK-53 virus,
respectively.

The promoter for the bacteriophage T7 RNA polymerase
was engineered at the 5' terminus of the cloned genomic
cDNA by incorporating the recognition sequence of the T7
RNA polymerase into the sequence of the 5'-terminal
upstream, mRNA-sense amplimer D2-SMT71 immediately
preceding the 5'-terminal nucleotide of the DEN-2 viral
genome. This design ensured that the T7 RNA polymerase
initiated RNA transcription at the 5'-terminal nucleotide
of the DEN-2 virus-specific cDNA (Milligan et al., 1987).
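The 3'-terminal extensions produced by XbaI linearization, as described above, can be sketched as follows. XbaI cuts T^CTAGA and leaves a 4-nucleotide 5' overhang on the template strand, so a run-off transcript copies through the CTAG of the site. The toy genome end and the idea that one amplimer overlaps the XbaI site with the final genomic T are assumptions made for illustration only; the text states only that the two amplimers were designed differently.

```python
XBAI_SITE = "TCTAGA"  # XbaI cleaves T^CTAGA on the sense strand


def runoff_extension(genome_cdna, site_overlaps_last_nt):
    """Return the non-viral 3' extension (DNA letters) of a run-off
    transcript from an XbaI-linearized template.

    genome_cdna           sense-strand cDNA ending at the genomic 3' end
    site_overlaps_last_nt True if the T of the XbaI site is also the
                          last genomic nucleotide (hypothetical design)
    """
    if site_overlaps_last_nt:
        if not genome_cdna.endswith("T"):
            raise ValueError("overlap design needs a genome ending in T")
        insert = genome_cdna[:-1] + XBAI_SITE
    else:
        insert = genome_cdna + XBAI_SITE
    sense_cut = insert.rfind(XBAI_SITE) + 1
    # The template strand extends 4 nt beyond the sense-strand cut.
    transcript = insert[:sense_cut + 4]
    return transcript[len(genome_cdna):]


toy_genome_end = "ACGTACGGTTCT"  # hypothetical placeholder sequence
print(runoff_extension(toy_genome_end, site_overlaps_last_nt=False))
print(runoff_extension(toy_genome_end, site_overlaps_last_nt=True))
```

Under these assumptions, appending the whole site gives the 5-nucleotide TCTAG extension and overlapping the site with the terminal T gives the 4-nucleotide CTAG extension, matching the two lengths reported for the 16681 and PDK-53 clones.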

Amplimers for PCR reactions were designed to take
advantage of RENZ sites identified within the nucleotide
sequence of the genome of DEN-2 16681 virus. cDNA
molecules were amplified to permit ligation or splicing
together of overlapping contiguous cDNA clones at shared,
overlapping, unique RENZ sites (Figure 3).

Transcription of Genomic mRNA from DEN Virus-Specific
Full-Length cDNA Clones:

The recombinant plasmid containing the full-length
cDNA clone was prepared for RNA transcription by
linearization at the unique XbaI site located at the 3'
terminus of the cloned genomic cDNA. The restriction
reaction containing the XbaI-restricted plasmid was
extracted sequentially with phenol:chloroform:isoamyl
alcohol and chloroform and then precipitated. The DNA was
redissolved in 50 µl of TE buffer and digested with
proteinase K at a concentration of 1 mg/ml for 1 h at 37°C
to hydrolyze contaminating RNases. The sample was then
extracted twice with "For RNA Only"
phenol:chloroform:isoamyl alcohol buffered to pH 8,
extracted twice with chloroform to remove traces of
phenol, and precipitated by adding one-tenth volume of
RNase-free 3 M sodium acetate, pH 5.2, and 2.5 volumes of
ethanol and incubating for at least 1 h at -70°C or
overnight at -20°C.

DEN-2 virus-specific genomic RNA was transcribed from
the linearized cDNA template using a commercial T7
transcription kit (Ampliscribe T7 transcription kit,
Epicentre Technologies). Transcription reactions were
performed for 2 h at 37°C in RNase-free 1.5-ml microtubes
in 20-µl reactions containing 100-1000 ng of linearized
DNA template, 7.5 mM each of CTP, GTP, and UTP, 0.75 mM
ATP, 2.7 mM m7GpppA cap analog, 6.7 mM DTT, 2.0 µl of a
10X concentration of a proprietary buffer supplied with
the commercial kit, and 2.0 µl of the proprietary
Ampliscribe enzyme solution supplied with the kit.
Reaction solutions were used directly and without further
treatment to transfect BHK-21 cells.
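The nucleotide concentrations in the transcription reaction can be compared directly. The following is a minimal sketch using only the concentrations stated above; the interpretation that ATP is kept low so that the m7GpppA cap analog competes for the initiating position is an assumption, not a statement from the text.

```python
# Final concentrations (mM) stated for the 20-ul reaction.
reaction_mM = {
    "CTP": 7.5,
    "GTP": 7.5,
    "UTP": 7.5,
    "ATP": 0.75,
    "m7GpppA": 2.7,
}

cap_to_atp = reaction_mM["m7GpppA"] / reaction_mM["ATP"]
atp_reduction = reaction_mM["CTP"] / reaction_mM["ATP"]

print(f"cap analog : ATP = {cap_to_atp:.1f} : 1")
print(f"ATP is {atp_reduction:.0f}-fold lower than the other NTPs")

# 2.0 ul of 10X buffer in a 20-ul reaction gives a 1X final concentration.
buffer_final_x = 10 * 2.0 / 20
print(f"buffer: {buffer_final_x:.0f}X")
```

The cap analog is present at a 3.6-fold molar excess over ATP, and ATP is tenfold lower than the other three nucleotides.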

Transfection of BHK-21 Cells with Genomic RNA Transcripts:

BHK-21 clone 15 cells were transfected with RNA
transcripts by electroporation (Liljestrom et al., 1991).
Fresh cultures of BHK-21 cells were grown to 90%
confluency, rinsed twice with cold RNase-free phosphate
buffered saline (PBS), and released from the plastic by
incubation with 3 ml of commercial trypsin-EDTA solution
(GIBCO-BRL). The cells were pelleted by low-speed
centrifugation at 1200 rpm for 5 min in a Beckman GPKR
centrifuge. The cells were washed twice with cold PBS,
resuspended in cold PBS, and kept on ice. The cells were
counted using a hemacytometer and microscope, and the cell
concentration was adjusted to 10^7 cells/ml. One-half ml of
the washed, adjusted cells was mixed with each
transcription reaction solution in 1.5-ml microtubes on
ice. The mixture was transferred to a cold
electroporation cuvette with 0.2-cm electrode gap, which
was placed in the cuvette holder of the Bio-Rad Gene
Pulser. The cells were shocked twice using settings of
1.5 kV voltage, 25 µF of capacitance, and resistance set
to infinity. The shocked cells were incubated for 10 min
at room temperature and then added to 75-cm^2 tissue flasks
containing 20 ml of MEM containing 10% FBS. Transfected
cell cultures were incubated at 37°C for 5-8 days until
CPE was evident in the cell monolayer and/or expression of
DEN virus-specific antigens was identified in an aliquot
of the cell monolayer scraped from the flask using DEN
virus-specific mouse hyperimmune ascitic fluid or
monoclonal antibodies in indirect immunofluorescence
tests.

RESULTS

Replication of DEN-2 16681 Virus: 

DEN-2 16681 virus replicates to high titer in cell
culture. The CDC virus seed used in this study contained
2.0 X 10^7 plaque forming units (PFU)/ml. This titer was
determined by plaque titration of the seed virus in
monolayer cultures of Vero cells. This seed titered 1.3 X
10^4 PFU/ml in LLC-MK2 cells. A growth curve for this virus
was determined in C6/36 Aedes albopictus cell culture
(Figure 4). This level of replication is quite high for a
flavivirus. The DEN-2 16681 virus is eminently suitable
to serve as the parent to an infectious cDNA clone of DEN
virus.

The DEN-2 PDK-53 vaccine virus, taken directly from a
vaccine vial obtained from Mahidol University, contained
3.4 X 10^4 PFU/ml of virus, as titrated in Vero cell
monolayers, and 1.5 X 10^4 PFU/ml as titrated in LLC-MK2
cell monolayers.

RT/PCR Amplification and Cloning of DEN-2 16681 cDNA:

The entire genome of DEN-2 virus, parental strain
16681, was amplified from genomic RNA in the form of 5
cDNA clones of various sizes (T7-F1, F2, F3, F4, and F5).
PCR amplification with 5 sets of upstream and downstream
amplimers yielded the predicted amplicon sizes in PCR
reactions. Figure 5 shows the migration of these cDNA
fragments in agarose gels.

Recombinant plasmids, obtained by ligating the cDNA
amplicons into the pCRII TA-vector, were extracted from
minicultures derived from transformed E. coli XL1-Blue
colonies. Uncut plasmids were screened for the presence
of cDNA insert by comparing their mobility in agarose gels
with the mobility of uncut wild-type pCRII vector plasmid.
Selected plasmids were then restricted with the
restriction enzyme EcoRI to confirm the size of the
inserted cDNA fragment. EcoRI digests of F2-Sal, Sal-F2,
and F3 plasmids derived from independent transformed
bacterial colonies are shown in Figure 6.

The following 15 DEN-2 16681 virus-specific cDNA
clones, shown schematically in Figure 7, were selected for
nucleotide sequence analysis:

Clone    RT/PCR amplicon
-----    ---------------    -----
 1       F1                 A8
 2       F1                 A21
 3       F1                 A25
 4       F1                 A26
 5       F2-Sal             AA2-4
 6       F2-Sal             AA2-8
 7       Sal-F2             AA3-3
 8       Sal-F2             AA3-4
 9       F3                 AA4-4
10       F3                 AA4-6
11       F4                 10
12       F4                 12
13       F5                 AA6-1
14       F5                 AA6-2
15       F5                 AA6-4

RT/PCR Amplification and Cloning of DEN-2 PDK-53 cDNA:

The entire genome of DEN-2 virus, vaccine strain
PDK-53, was amplified from genomic RNA in the form of 23
cDNA clones of various sizes. Even though the PDK-53
vaccine contained only about 10^4 PFU/ml of virus, we were
able to routinely amplify cDNA from RNA that was extracted
directly from this seed virus. To accomplish this, we
routinely used the "extended PCR method", incorporating the
Taq extender reagent (Stratagene) in the PCR reactions.
We had previously shown that the Taq extender
significantly enhanced yields of large molecular weight
amplicons in the PCR amplification of the nonstructural
genes of the flavivirus, St. Louis encephalitis virus
(Figure 8A). For extended PCR reactions, reaction
mixtures were made as for standard PCR reactions, but the
standard PCR buffer was replaced with the Taq extender
buffer, and 1 unit of AmpliTaq DNA polymerase
(Perkin-Elmer) and 1 unit of the Taq extender enzyme per
kbp of expected amplicon size were included in the
reaction. Figure 8B shows the correct agarose gel
migration of large cDNA amplicons F1 (containing the T7
RNA polymerase promoter at the 5' end of the mRNA-sense
strand of the amplicon), F2, and F3 obtained by PCR
amplification using DEN-2 PDK-53 viral genomic RNA as
template. The standard PCR reaction also worked for a
number of DEN-2 PDK-53 amplifications.
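The enzyme amounts for an extended PCR reaction scale with the expected amplicon size. The following is a minimal sketch, assuming that the "per kbp" rule applies to both the AmpliTaq polymerase and the Taq extender (the wording of the text allows this reading but does not make it certain). The amplicon lengths are those listed for F1, F2, and F3 in the PDK-53 clone table; the function name is illustrative.

```python
def extended_pcr_units(amplicon_bp):
    """Units of each enzyme for one reaction, assuming 1 unit per kbp
    of expected amplicon for both enzymes."""
    kbp = amplicon_bp / 1000
    return {"AmpliTaq_units": kbp, "Taq_extender_units": kbp}


expected_bp = {"F1": 1552, "F2": 3355, "F3": 2676}
for name, size in expected_bp.items():
    units = extended_pcr_units(size)
    print(f"{name} ({size} bp): "
          f"{units['AmpliTaq_units']:.2f} U AmpliTaq, "
          f"{units['Taq_extender_units']:.2f} U Taq extender")
```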

The PDK-53 PCR products were cloned into the pGEM-5Zf
TA-vector (Promega) or the pT7Blue(R) TA-vector (Novagen).
Although we seemed to have the best cloning efficiency of
PCR amplicons in the pCRII TA-vector, the other vector
kits were less expensive and worked well. The cloning
efficiency of PCR products into the TA-vector decreased
rapidly as amplicon size increased beyond 2000 bp.

The following 23 DEN-2 PDK-53 virus-specific cDNA
clones were selected for nucleotide sequence analysis:

         RT/PCR      Expected
                     Amplicon
CLONE    AMPLICON    length      Up-Amplimer    Down-Amplimer
-----    --------    --------    -----------    -------------
 1       F1-5        1552-bp     D2-SMT71       cD2-1510
 2       F1-7        "           "              "
 3       F1-9        "           "              "
 4       F1-75A      "           "              "
 5       F1-79B      "           "              "
 6       F2-14       3355-bp     D2-1261        cD2-4615
 7       F2-16B      "           "              "
 8       F3-33       2676-bp     D2-4257        cD2-6932
 9       F3-3C       "           "              "
10       F4-9        2373-bp     D2-6493        cD2-8865
11       F4.9-22     2937-bp     D2-6493        cD2-9429
12       F4.9-53     "           "              "
13       F4.5-1      1897-bp     D2-8440        cD2-10337
14       F4.5-2      "           "              "
15       F4.5-6      "           "              "
16       F4.5-7      "           "              "
17       F5-72       1914-bp     D2-8773        cD2-10687.X2
18       F5-77       "           "              "
19       F5-78       "           "              "
20       F3.5-4      1375-bp     D2-6046        cD2-7420
21       F3.5-6      "           "              "
22       F3.5-19     "           "              "
23       F3-3K       2676-bp     D2-4257        cD2-6932
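The expected amplicon lengths in the table above can be compared with the amplimer designations. The following is a minimal sketch, assuming that the number in each amplimer name is a genome coordinate (an inference from the names, not something stated in the text) and that the amplicon spans both coordinates inclusively. The D2-SMT71 amplimer carries no coordinate in its name and is left out.

```python
import re


def coordinate(amplimer):
    """Extract the numeric coordinate from a name such as 'cD2-4615'.
    Returns None when the name holds no coordinate (e.g. 'D2-SMT71')."""
    match = re.search(r"D2-(\d+)", amplimer)
    return int(match.group(1)) if match else None


def naive_length(up_amplimer, down_amplimer):
    """Inclusive span between the two assumed coordinates."""
    return coordinate(down_amplimer) - coordinate(up_amplimer) + 1


# (up-amplimer, down-amplimer, length listed in the table)
ROWS = [
    ("D2-1261", "cD2-4615", 3355),
    ("D2-4257", "cD2-6932", 2676),
    ("D2-6493", "cD2-8865", 2373),
    ("D2-6493", "cD2-9429", 2937),
    ("D2-6046", "cD2-7420", 1375),
    ("D2-8440", "cD2-10337", 1897),
    ("D2-8773", "cD2-10687.X2", 1914),
]

for up, down, listed in ROWS:
    calc = naive_length(up, down)
    note = "matches" if calc == listed else f"differs by {calc - listed:+d}"
    print(f"{up} / {down}: listed {listed}, calculated {calc} ({note})")
```

Five of the listed lengths are reproduced exactly under this assumption. The two 3'-proximal amplicons differ by one nucleotide, which may reflect a different naming convention for those amplimers or an error in this copy of the table; the sketch does not resolve which.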



Nucleotide Sequence Analyses of DEN-2 16681 cDNA Clones:

EcoRI fragments of the 15 DEN-2 16681 virus-specific
cDNA clones were subcloned into the single-stranded
bacteriophage M13mp18 or M13mp19 for sequencing.
Sequencing of the entire viral genome was performed
manually using radioisotopic labeling and exposure,
development, and reading of autoradiographs. The data was
read from the films and entered by hand into a sequence
data spreadsheet.

The locations of observed cDNA artifacts or "errors"
dictated the splicing strategy of subclones to construct
the full genome-length clone. If the nucleotide at a
particular position of one cDNA clone differed from the
nucleotides at that same position in 2 or more independent
clones, then the nucleotide in the first clone was deemed
to be an error. If only 2 cDNA clones were sequenced for
a given region of the genome and they differed in sequence
at a particular position, then the clone containing the
nucleotide that agreed with the sequence data of Blok et
al. (1992) was deemed to be correct. The approximate
locations of the cDNA errors identified in the 16681
clones are illustrated in Figure 9.
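The decision rule described above for calling cDNA errors can be sketched in code. This is an illustration of the stated rule only, not the procedure actually used, which was carried out by hand in a spreadsheet. The function name is hypothetical, and the handling of ties among more than two clones (falling back to the published sequence) is an assumption that the text does not address.

```python
from collections import Counter


def call_base(clone_bases, published=None):
    """Return the nucleotide deemed correct at one genome position,
    or None when the stated rule cannot decide.

    clone_bases  bases observed in independent cDNA clones
    published    base in the published reference sequence, if any
    """
    counts = Counter(clone_bases).most_common()
    if len(counts) == 1:
        return counts[0][0]                 # all clones agree
    best, n_best = counts[0]
    if n_best >= 2 and n_best > counts[1][1]:
        return best                         # 2 or more clones outvote the rest
    if published is not None and published in clone_bases:
        return published                    # tie broken by published sequence
    return None


print(call_base(["A", "A", "G"]))           # odd clone deemed an error
print(call_base(["A", "G"], published="G"))  # two clones disagree
print(call_base(["A", "G"]))                 # undecidable without reference
```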

The full genome-length cDNA clone of DEN-2 16681
virus was first constructed in pUC19. Unfortunately, RNA
transcribed from this clone was not infectious. When
over 90% of the full-length cDNA in the clone was
resequenced, it was determined that several mutations had
occurred during splicing and cloning manipulations of the
subclones in E. coli. One of these mutations was a base
deletion in the NS4B gene. This deletion would cause a
frameshift of the amino acid sequence, resulting in
ribosomal translation of a nonsense polypeptide downstream
of the mutation point. This fatal deletion, by itself,
would explain the noninfectious nature of the RNA
transcribed from the first full-length clone in pUC19.
The final, correct cDNA subclones (F1-E, F2-E, F3/4/5-F)
that were incorporated into the full-length, successfully
infectious clone of 16681 virus were reanalyzed by direct
sequencing of the double-stranded plasmid DNA via the
thermocycling method using the Taq DyeDeoxy Terminator
Cycle Sequencing Kit. Sequence analysis was performed
using the automated 373A DNA sequencing machine. The
color-coded sequence chromatograms were read by the
investigator and the data was entered manually into a
computer-based spreadsheet.
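The consequence of a single-base deletion, as described above for the NS4B gene, can be illustrated with a toy reading frame. The sequence below is a hypothetical placeholder and is not NS4B sequence; the codon table is the standard genetic code.

```python
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, built in TCAG order for each codon position.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq):
    """Translate complete codons in frame 1; '*' marks a stop codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def delete_base(seq, index):
    """Return seq with the base at the 0-based index removed."""
    return seq[:index] + seq[index + 1:]


toy_orf = "ATGGCTGAACTGAAAGGTCTGTAA"   # hypothetical toy reading frame
mutant = delete_base(toy_orf, 6)

print("intact :", translate(toy_orf))
print("deleted:", translate(mutant))
```

In this toy case every codon downstream of the deletion is read out of frame, and a premature stop codon appears shortly after the mutation point.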

We independently confirmed the sequence of the
5'-terminal 32 nucleotides of the DEN-2 16681 viral genome.
A 5'-end RNA-cDNA hybrid molecule, made with primer
cD2-996 and reverse transcriptase, was 3'-tailed with dCTP
and annealed to dGTP-tailed, PstI-cut M13mp19 RF DNA. One
of the resulting M13 clones had a cDNA run-off product
containing the 5'-terminal end of the genome. The 5'-end
sequence was identical to that published for DEN-2 1409
(Deubel et al., 1988) and DEN-2 16681 (Blok et al., 1992).
We have not independently confirmed the sequence of the
3'-terminal 36 nucleotides of DEN-2 16681 virus or the 5'-
or 3'-terminal nucleotides of DEN-2 PDK-53 virus.

We sequenced uncloned, PCR-derived amplicon cDNA
fragments directly for the following regions of the DEN-2
16681 viral genome: nucleotides 70-260, 330-870, 890-1690,
1890-3720, 3770-4050, 4080-4320, and the 3'-terminal
9990-10686. Unlike the sequencing of cloned DNA, direct
analysis of PCR amplicons provides sequence information
for the majority population of amplified cDNA molecules,
and therefore for the majority population of template RNA
molecules.

We observed very early in the project that the
nucleotide sequence of DEN-2 16681 virus that we
determined at the CDC laboratory differed significantly
from the sequence of DEN-2 16681 virus as published by
Blok et al. (1992). Our nucleotide sequence differed from
that published by Blok et al. (1992) at 60 nucleotide
positions, which were located throughout the genome.
Amino acid substitutions were encoded by 26 of these
nucleotide differences. The approximate genomic locations
of the nucleotide differences are illustrated in the
schematic diagram in Figure 10. The exact nucleotide
positions of the discrepancies are shown in Figure 11.

Nucleotide Sequence Analyses of DEN-2 PDK-53 cDNA Clones:

The DEN-2 PDK-53 virus-specific cDNA clones were
analyzed by direct sequencing of the double-stranded
plasmid DNA by the thermocycling method using the Taq
DyeDeoxy Terminator Cycle Sequencing Kit. The 3'-end
sequence from nucleotide position 10290-10686 was also
determined by direct sequencing of PCR-derived amplicon
cDNA. Sequence analysis was performed using the automated
373A DNA sequencing machine. The color-coded sequence
chromatograms were read by the investigator and the data
was entered manually into a computer-based spreadsheet.
The approximate locations of the cDNA errors identified in
the PDK-53 cDNA clones are illustrated in Figure 12.

Our determination of the nucleotide sequence of DEN-2
PDK-53 virus differed significantly from the PDK-53
genomic sequence published by Blok et al. (1992). The
latter investigators reported a total of 53 nucleotide
differences that encoded 27 amino acid mutations between
the nucleotide sequences of the genome of DEN-2 16681
virus and that of its vaccine derivative, PDK-53 virus.
They reported the following nonsilent mutations: 1 in the
capsid, 2 in prM, 1 in M, 3 in E, 3 in NS1, 3 in NS2A, 2
in NS2B, 3 in NS3, 3 in NS4A, 3 in NS4B, and 3 in NS5. We
detected only 8 nucleotide mutations between the genomes
of these two virus strains. One mutation occurred in the
5'-NC region of the genome, while 7 nucleotide mutations,
4 of which encoded amino acid substitutions, occurred in
the coding region of the genome as shown in Figure 13 and
the following table.






Table: Summary of nucleotide differences between the
genomes of DEN-2 16681 virus and its vaccine
derivative virus, strain PDK-53.

Genome                   Nucleotide          Amino Acid
Position   Gene          16681   PDK-53      16681   PDK-53

57 a       5'-NC         C       T           -       -
524 a      prM-29        A       T           Asp     Val
2055 a     E-373         C       T           Phe     Phe
2579 a     NS1-53        G       A           Gly     Asp
4018       NS2A-151      C       T           Leu     Phe
5547       NS3-342       T       C           Arg     Arg
6599 a     NS4A-75       G       C           Gly     Ala
8571 a     NS5-334       C       T           Val     Val

a 16681 vs. PDK-53 difference agrees with Blok et al.
(1992)
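
The counts stated before the table (8 nucleotide differences, 1 in the 5'-NC region, 7 in the coding region, 4 of them encoding amino acid substitutions) can be checked against the table rows. The following Python sketch is illustrative only: the tuple layout and the function names are assumptions introduced here, and the rows are simply transcribed from the table.

```python
# Rows transcribed from the table of 16681 vs. PDK-53 differences:
# (genome position, region, 16681 nt, PDK-53 nt, 16681 aa, PDK-53 aa)
# The 5'-noncoding position encodes no amino acid, so it carries None.
DIFFERENCES = [
    (57,   "5'-NC",    "C", "T", None,  None),
    (524,  "prM-29",   "A", "T", "Asp", "Val"),
    (2055, "E-373",    "C", "T", "Phe", "Phe"),
    (2579, "NS1-53",   "G", "A", "Gly", "Asp"),
    (4018, "NS2A-151", "C", "T", "Leu", "Phe"),
    (5547, "NS3-342",  "T", "C", "Arg", "Arg"),
    (6599, "NS4A-75",  "G", "C", "Gly", "Ala"),
    (8571, "NS5-334",  "C", "T", "Val", "Val"),
]


def classify(row):
    """Label one difference as 'noncoding', 'silent', or 'nonsilent'."""
    aa_parent, aa_vaccine = row[4], row[5]
    if aa_parent is None:
        return "noncoding"
    return "silent" if aa_parent == aa_vaccine else "nonsilent"


def summarize(rows):
    """Count the differences in each class."""
    counts = {"noncoding": 0, "silent": 0, "nonsilent": 0}
    for row in rows:
        counts[classify(row)] += 1
    return counts


if __name__ == "__main__":
    for position, region, nt_parent, nt_vaccine, _, _ in DIFFERENCES:
        label = classify((position, region, nt_parent, nt_vaccine, _, _))
        print(f"{position:>5} {region:<9} {nt_parent}->{nt_vaccine} {label}")
    print(summarize(DIFFERENCES))
```

Under these assumptions the nonsilent differences fall in prM, NS1, NS2A, and NS4A, while those in E, NS3, and NS5 are silent.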



The few nucleotide positions where our data and those
of Blok et al. (1992) agreed, in terms of sequence
differences between the 16681 and PDK-53 viral genomes,
were distributed throughout the genome. The entire genome
of DEN-2 16681 virus was cloned and sequenced before we
received the PDK-53 vaccine virus at our laboratory.
Except for the 3'-terminal cDNA clones #17-#19, every
PDK-53 virus-specific cDNA clone constructed in our
laboratory contained at least one nucleotide position of
16681/PDK-53 sequence difference confirmed by both
ourselves and Blok et al. (1992). Therefore, our PDK-53
virus-specific cDNA clones did not result from
contamination of PDK-53-specific PCR reactions with 16681
virus-specific cDNA template. Our PDK-53 virus-specific
cDNA clones, which also contained the many sequence
discrepancies between our data and those of Blok et al.
(1992), encoded the nucleotide sequence from the 5'
terminus to nucleotide position 10337 of the genome of
PDK-53 virus. The 3'-terminal 387 nucleotides
(10337-10723) of DEN-2 PDK-53 virus were identical to
those of the parental 16681 virus. Since none of the
PDK-53 virus-specific cDNA clones covering this region of
the genome contained a point of confirmed 16681/PDK-53
sequence difference, we repeated the PCR amplification of
the 3' terminus of the PDK-53 virus genome. This was done
to ensure that the 3'-terminal cDNA clones #17-#19 did not
result from PCR reactions contaminated by 16681
virus-specific DNA template. The PCR reaction components
were pipetted in a room in which DEN cloning had not been
performed previously, using new micropipettors, newly
opened pipet tips with aerosol barrier, and freshly made
stock reagents. Direct sequencing of the resulting
double-stranded PCR cDNA amplicon confirmed that the
3'-terminal 387 nucleotides of DEN-2 PDK-53 virus were
indeed identical to the 3' terminus of the 16681 parent.

The finalized nucleotide sequence of DEN-2 virus,
strain 16681, including the nucleotide and amino acid
mutations identified for DEN-2 PDK-53 virus, is shown in
Figure 14.

Construction of DEN-2 16681 Full-Length Clone in pUC19:

For the construction of the full genome-length cDNA
clone of DEN-2 16681 virus, 5 of the sequence-characterized
PCR-amplified cDNA subclones were selected for splicing.
However, clone #5 contained a cDNA "error" that was not
readily spliced out with the existing clones. This error,
which was a C-to-T mutation at nucleotide position 1730
and encoded a nonsilent Thr-to-Ile amino acid substitution
at E-265, was incorporated into the F2 construct. The
intermediate F2 construct was the result of splicing the
F2-Sal clone (#5) SphI/HpaI fragment to the Sal-F2 clone
(#7) HpaI/KpnI fragment in the MCS of plasmid pUC18
(Figure 15). To correct the error, a new PCR amplicon was
made using primers D2-1261 and CD2-2955. Resulting clones
in the TA-vector were sequenced, and the correct SphI/HpaI
fragment of a new clone was substituted for the faulty
SphI/HpaI fragment of the original F2 construct
(Figure 16). The corrected F2 clone was designated F2-C.

The relevant cDNA clones of DEN-2 16681 virus were
spliced together via a series of intermediate ligation
products in the MCS of pUC18 to yield F1/3/4/5, which
contained all of the genome except for the SphI-KpnI
1380-4493 region present in clone F2-C. Multiple attempts
to ligate the F2-C SphI/KpnI cDNA fragment into F1/3/4/5
in pUC18 failed. The cDNA insert of F1/3/4/5-pUC18 was
then transferred to the MCS of pUC19, resulting in
F1/3/4/5-pUC19. This operation simply reversed the
orientation of the cDNA insert within the context of the
pUC plasmid. Ligation of SphI/KpnI-cut F1/3/4/5-pUC19 and
F2-C SphI/KpnI insert readily yielded transformants in
E. coli XL1-Blue that contained the full-length cDNA clone
F1/2/3/4/5-pUC19, which was designated pD2/IC-20. The
detailed splicing procedures for pD2/IC-20 are illustrated
in Figure 17. The orientation-specific cloning of the
full genome-length cDNA in pUC19 rather than pUC18 is
diagrammed in Figure 18.

The full genome-length cDNA of DEN-2 16681 virus was
cloned into the MCS of pUC19. Apparent full genome-length
viral mRNA was transcribed from linearized pD2/IC-20.
This transcribed product failed to yield infectious virus
following electroporation of BHK-21 cells. Most of the
cDNA in the pD2/IC-20 clone was resequenced, and several
cloning artifacts, including a fatal single-nucleotide
deletion, were identified. The original subunit
intermediate cDNA constructs in pUC18 were resequenced to
confirm that they possessed the correct sequence, and
were corrected where necessary. The corrected primary
cDNA clones F1, F2-C, and F3/4/5 were then ligated into
the low-copy plasmid pBR322, rather than the high
copy-number pUC18 plasmid. It was envisioned that the
cDNA would be more stable in a slower-replicating plasmid
in E. coli.

To enable more straightforward cloning into pBR322,
the MCS of pUC19 was spliced into the pBR322 plasmid
(Figure 19). This resulted in plasmids pBRUC-138 and
pBRUC-139 containing the pUC MCS in both orientations
within the pBR322 plasmid backbone. The SphI site was
removed from both pBRUC plasmids by cutting with SphI,
blunt-ending of the cut ends using T4 DNA polymerase, and
then ligating the ends back together. This was necessary
for the construction of the full-length cDNA clone because
SphI is one of the cDNA restriction/splicing sites for the
clone.

The F3/4/5-F cDNA clone of DEN-2 16681 virus, which
had been verified by sequence analysis, was cloned into
pBRUC-139 (SphI*) (Figure 20). Following this ligation,
the F1-E and F2-C cDNA clone fragments were also moved
into the pBR322 backbone to construct the full
genome-length cDNA clone, pD2/IC-30P (Figure 20). This
recombinant plasmid was replicated successfully in both
TB-1 and MC-1061 strains of E. coli.

Construction of DEN-2 PDK-53 Infectious cDNA Clone:

The full-length infectious clone of DEN-2 16681 virus
was used in the construction of the infectious clone for
PDK-53 virus. Since the 3'-noncoding regions of the
genomes of both viruses are identical, and the amino acid
sequences of the translated precursor polyproteins encoded
by genome nucleotide positions 6646-10269 are identical in
both viruses, the infectious clone of PDK-53 virus was
constructed using the 16681 3'-end cDNA from the NheI site
at nucleotide position 6646 to the 3' terminus of the
genome (Figure 21). After correcting a cDNA error in the
PDK-53 F3-3C subunit clone, this fragment and the F2-16B
cDNA fragment were ligated into the infectious clone
backbone to construct the DEN-2 PDK-53 virus-specific
full-length cDNA clone, pD2/IC-130V (Figure 21).

Transcription of Viral mRNA from DEN-2 Infectious cDNA
Clones:

Viral genomic RNA extracted from gradient-purified
virions was analyzed by nondenaturing RNA agarose gel
electrophoresis to observe the level of RNA degradation
and the limits of detectability by ethidium bromide
staining. Figure 22 shows an agarose gel electropherogram
for 22-383 ng of viral genomic RNA obtained from purified
preparations of wild-type DEN-2 16681 virus and wild-type
Venezuelan equine encephalitis (VEE) virus, strain
Trinidad donkey. Although degradation of the RNA is
visible as a spectrum of smaller molecular weight nucleic
acid (smear in Figure 22), definite full genome-length RNA
bands are clearly visible. This smear of nucleic acid is
probably also due, in part, to multiple conformations of
the single-stranded RNA molecules, which migrate through
the gel at different rates. The relative gel migration of
the single-stranded RNA does not correlate directly with
the sizes of the double-stranded molecular weight marker
DNA bands (MW, Figure 22); the VEE and DEN-2 viral genomes
are 11,447 and 10,723 nucleotides in length, respectively.
BHK-21 and C6/36 cells were transfected successfully by
electroporation with 2000, 500, 100, 10, 1, and 0.1 ng of
viral genomic RNA extracted from purified VEE or DEN-2
16681 virus, as indicated by development of CPE,
expression of viral proteins detected by indirect
immunofluorescence tests using virus-specific antibody,
and/or by plaque titration of infectious virus from the
transfected-cell culture medium. RNA quantities of 1 ng
or less were essentially undetectable in the ethidium
bromide-stained agarose gel system we used. Therefore,
authentic RNA transcripts derived from full genome-length
cDNA and visualized in agarose gel electropherograms of
transcription reactions should be infectious for BHK-21
cells by electroporation.

Investigators previously constructed an infectious
cDNA clone for VEE virus as reported by Kinney et al.
(1989). RNA transcription reaction conditions that
yielded high quantity and quality of infectious mRNA
transcripts from the pVE/IC-92 infectious clone of VEE
virus failed in multiple attempts to transcribe RNA from
the pD2/IC-20 clone of DEN-2 16681 virus. Figure 23 shows
an agarose gel electropherogram that demonstrates
successful transcription of RNA from the VEE clone, but
not pD2/IC-20.

In an attempt to improve RNA transcription from the
DEN-2 clone, commercial transcription kits were purchased.
The Megascript transcription kit supplied by Ambion also
failed to transcribe RNA from the DEN clone. However, the
Ampliscribe kit obtained from Epicentre Technologies
enabled efficient transcription of RNA from the DEN-2
clone (Figure 24).

The success of the Ampliscribe kit apparently was due
to the high concentration of ribonucleotides and a very
high, but proprietary, concentration of T7 RNA polymerase.
The RNA transcribed from pD2/IC-20 was not infectious.
However, viral mRNA transcribed from DEN-2 16681 clone
pD2/IC-30P and PDK-53 clone pD2/IC-130V was infectious
(Figure 25).

Viral mRNA transcripts from both replicates of
pD2/IC-30P (A and D) and pD2/IC-130V (F and J) were
infectious, producing viable infectious virus in
electroporated BHK-21 cells. Figure 26 shows RNA
transcripts from pD2/IC-20, pD2/IC-30P, and pD2/IC-130V.

Construction of DEN-2 16681/PDK-53 Chimeric cDNA Clones:

Several chimeric full-length cDNA clones were derived
from the pD2/IC-30P and pD2/IC-130V clones. All clones
were constructed in the pBRUC-139 derivative of the pBR322
plasmid vector. E. coli strains XL1-Blue, MC-1061, and
TB-1 were successfully transformed with ligated
recombinant plasmids containing full genome-length cDNA.
Viable virus was derived from all of the indicated clones.
The evolutionary tree for the chimeric viruses is
diagrammed in Figure 27.

Details concerning the splicing strategies for the
chimeric clones are shown in Figure 28. Appropriate cDNA
fragments were cut and ligated together at the internal
SalI, SphI, KpnI, and NheI sites as well as at the 5'-SstI
and 3'-XbaI sites.

Viable prototype and chimeric viruses were derived
from each of the clones indicated in Figure 28 by
electroporation of BHK-21 cells with viral genome-length
mRNA transcribed from linearized plasmids. Seed stocks of
these viruses were prepared by centrifuge-clarification of
the cell culture medium, adjustment of the FBS
concentration to 10%, and freezing of seed aliquots at
-70°C. Virus concentrations were determined by plaque
titration of the virus seeds in monolayer cultures of Vero
cells. The results of these virus titrations are shown in
the following table.

Table. Plaque titration of DEN-2 16681 and PDK-53
stock seed viruses and chimeric viruses
recovered from BHK-21 cells transfected
with infectious clone-derived viral mRNA
transcripts.

Virus             PFU/ml        Genotype a

DEN-2 16681       8.0 x 10^7    c D F G L R G V
DEN-2 PDK-53      5.1 x 10^3    t V . D F . A .

D2/IC-30P-A       3.6 x 10^5
D2/IC-30P-A2      1.7 x 10^5

D2/IC-130V-F      4.0 x 10^5    t V . D F . A .
D2/IC-130V-J      2.2 x 10^5    t V . D F . A .

D2/IC-130V2-1     2.8 x 10^5    t V . . . . A .
D2/IC-130V2-7     8.8 x 10^4    t V . . . . A .

D2/IC-31-12       2.1 x 10^5    t V . . . . . .
D2/IC-31-15       3.2 x 10^5    t V . . . . . .

D2/IC-32-A        1.4 x 10^6    . . . D F . . .
D2/IC-32-G        1.2 x 10^6    . . . D F . . .

D2/IC-33-C        9.6 x 10^4    . . . . . . A .
D2/IC-33-P        1.9 x 10^5    . . . . . . A .

D2/IC-321-L       1.1 x 10^5    t V . D F . . .
D2/IC-321-N       7.6 x 10^5    t V . D F . . .

D2/IC-323-B       7.2 x 10^5    . . . D F . A .
D2/IC-323-I       8.8 x 10^5    . . . D F . A .

D2/IC-31-57-5     2.4 x 10^5    t . . . . . . .
D2/IC-31-524-D    3.2 x 10^4    c V . . . . . .

a Genotype is designated in lower case for the
virus-specific 5'-noncoding nucleotide and in
upper case single-letter amino acid abbreviation
for amino acids encoded by virus-specific
nucleotide mutations. Dots represent nucleotide
or amino acid sequence identity with DEN-2 16681
virus.
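
The genotype notation in the footnote can be read mechanically: each dot stands for the DEN-2 16681 character at that position. The Python sketch below is only an illustration of that reading. It assumes that the eight genotype positions follow the order of the eight loci in the earlier table of 16681/PDK-53 differences, and the names LOCI, PARENT, expand, and differing_loci are introduced here for the example.

```python
# Assumed order of the eight genotype positions (taken from the earlier
# table of 16681/PDK-53 differences) and the DEN-2 16681 genotype string.
LOCI = ["5'-NC", "prM-29", "E-373", "NS1-53",
        "NS2A-151", "NS3-342", "NS4A-75", "NS5-334"]
PARENT = "cDFGLRGV"


def expand(genotype):
    """Replace each dot with the DEN-2 16681 character at that position."""
    compact = genotype.replace(" ", "")
    if len(compact) != len(PARENT):
        raise ValueError("genotype must have one character per locus")
    return "".join(p if g == "." else g for g, p in zip(compact, PARENT))


def differing_loci(genotype):
    """Return the loci at which a genotype differs from DEN-2 16681."""
    full = expand(genotype)
    return [locus for locus, g, p in zip(LOCI, full, PARENT) if g != p]


if __name__ == "__main__":
    pdk53 = "t V . D F . A ."  # PDK-53 genotype written over all eight positions
    print(expand(pdk53))
    print(differing_loci(pdk53))
```

Read this way, the PDK-53 genotype differs from 16681 at the 5'-NC, prM, NS1, NS2A, and NS4A loci; the silent differences in E, NS3, and NS5 appear as dots because the encoded amino acid is unchanged.
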

To establish the validity of the clone-derived
chimeric viruses, relevant genomic cDNA fragments were
amplified directly from seed viruses by PCR and
spot-sequenced. The results are shown in Figure 29. This
validation process is ongoing. Except for D2/IC-31-524
virus, appropriate cDNA insert regions in chimeric viruses
have been confirmed by sequence analysis. Except for
D2/IC-30P, D2/IC-130V, and D2/IC-31-57, which have been
fully confirmed, clone-derived chimeric viruses have yet
to be spot-sequenced in a recipient clone-derived cDNA
region to definitely establish the chimeric nature of the
virus. The recipient clone is the recombinant plasmid
backbone into which a cDNA fragment, the insert fragment,
from a heterologous donor clone is spliced. Where
duplicate clone-derived viruses were obtained, both
viruses of a given genotype were spot-sequenced, and both
gave the same result, which is shown in Figure 29.

Submission of pD2/IC-30P and pD2/IC-130V to ATCC:

Patent deposits of the full genome-length cDNA clones
of DEN-2 16681 and PDK-53 viruses were submitted to the
American Type Culture Collection (ATCC), Rockville,
Maryland, U.S.A. Both pD2/IC-30P-A and pD2/IC-130V-F were
grown overnight in E. coli TB-1 cells. Six cryogenic
vials containing 1 ml each of frozen cell culture in
10% glycerol were submitted by dry ice shipment. Prior to
shipment, plasmid was extracted from a 1 ml aliquot of
each virus-specific culture. The recombinant full-length
plasmid was recovered from the cells as shown in Figure
30.

The pD2/IC-30P-A deposit with the ATCC was assigned
accession number ATCC 69826, and the pD2/IC-130V-F deposit
with the ATCC was assigned accession number ATCC 69825.
Date of deposit was May 25, 1995.

Construction of Chimeric DEN-2/1, -2/3, and -2/4
Infectious Clones:

We contemplate deriving chimeric DEN-2/1, DEN-2/3,
and DEN-2/4 viruses from recombinant full genome-length
cDNA clones containing the genetic background of DEN-2
PDK-53 virus and the prM and E genes of the DEN-1, DEN-3,
and DEN-4 candidate vaccine viruses, respectively. To
accomplish this, the prM and E genes of the vaccine
viruses were amplified by PCR. Because our laboratory has
been establishing a sequence database to analyze the
molecular epidemiology of several flaviviruses, including
all of the serotypes of dengue virus, the primers used for
cDNA amplification in the PCR were readily available at
our laboratory. The amplified cDNA molecules were
sequenced directly, thus providing the sequence of the
population of virions in the virus seed. The cDNA
amplicons for the DEN-1, DEN-3, and DEN-4 vaccine viruses
have all been cloned into the pGEM-5Zf TA-vector. The
cloned cDNA has not been analyzed by sequencing, since
it will be necessary to rederive the cDNA amplicons by PCR
to incorporate appropriate RENZ cleavage sites within the
amplicon for splicing into the full-length cDNA backbone
of DEN-2 PDK-53 virus. The partial nucleotide sequences
of the genomes of the DEN-1, DEN-3, and DEN-4 vaccine
viruses were aligned with the DEN-2 PDK-53 sequence. All
four sequences are aligned with the nucleotide sequence of
DEN-2 16681 virus and its deduced amino acid sequence in
Figure 31. The deduced amino acid sequences of the DEN
viruses are aligned in Figure 32.

It is readily evident from the aligned nucleotide
sequence data that useful restriction enzyme sites in the
DEN-2 virus-specific cDNA are not conserved in the DEN-1,
DEN-3, and DEN-4 viruses. Therefore, splicing sites must
be engineered into the cDNA to enable the splicing of
heterotypic DEN-1, DEN-3, and DEN-4 prM and E genes into
the DEN-2 backbone. It is not yet clear precisely how the
nonstructural proteins of flaviviruses interact with the
structural proteins during intracellular maturation of the
virus. Furthermore, the interaction of the capsid protein
with the genomic mRNA molecule in the nucleocapsid of the
virion has not been defined. However, coexpression of the
E and prM proteins has been more successful than
expression of E alone in expression systems in vitro. The
DEN-2 nonstructural proteins are involved in all
virus-specific intracellular polyprotein processing and
replication of viral mRNA, and the predominant portion of
the mRNA genome interacting with the capsid protein is
presumably, but not necessarily, DEN-2 virus-specific.
For these reasons, our strategy is to splice in the prM
and E genes of DEN-1, DEN-3, and DEN-4 viruses very
precisely, while maintaining the DEN-2 context of the
bracketing capsid and NS1 protein regions.

The strategies for creating XhoI and XbaI splice
sites at the 5' end of the prM gene and near the 3' end of
the E gene are illustrated in detail in Figures 33 and 34,
respectively. Briefly, mutagenic primers containing the
appropriate RENZ site are utilized in PCR reactions to
synthesize new cDNA for the prM and E genes of all four
viruses. A DEN-2 PDK-53 virus-specific cDNA cassette
plasmid, designated pD2V-CAS12, containing the genome
region from the 5' terminus through nucleotide position
4696 is constructed via intermediate plasmid constructs
pF1-Xho and pF2-Xba as illustrated in Figures 35 and 36.
The XhoI/XbaI cDNA fragments cut directly from DEN-1,
DEN-3, and DEN-4 virus-specific amplicons synthesized by
PCR using the mutagenic primers are ligated into the
pD2V-CAS12 cassette plasmid to create subclone chimeras.
The SstI/KpnI fragments of the resulting pD1V-CAS12,
pD3V-CAS12, and pD4V-CAS12 cassettes are moved into
pD2/IC-130V restricted with SstI/KpnI to create the
chimeric full genome-length cDNA clones (Figure 36).

Discussion

Infectious cDNA clones permit the directed
engineering of viral genomes. Depending on their
viability in terms of ability to replicate in cell
culture, infectious clone-derived viruses can be modified
by incorporating point mutations, multiple mutations,
deletions, gene regions of related or heterologous
viruses, or nonviral genes. Infectious cDNA clones have
been developed for many RNA viruses, including
flaviviruses DEN-4 (Lai et al., 1991), yellow fever (Rice
et al., 1989), Kunjin (Khromykh and Westaway, 1994),
Japanese encephalitis (Sumiyoshi et al., 1992), and TBE
(unpublished data). We describe herein the development of
infectious cDNA clones for DEN-2 16681 virus and its
candidate vaccine derivative, strain PDK-53. We also
describe the construction of chimeric viruses
incorporating the prM and E genes of candidate DEN-1,
DEN-3, and DEN-4 vaccine viruses within the genetic
background of the DEN-2 PDK-53 vaccine virus.

Although the candidate vaccine viruses developed at
Mahidol University are currently the best live DEN virus
vaccine candidates in terms of immunogenicity and safety
in adult humans, the DEN-1, DEN-3, and DEN-4 vaccine
viruses replicate poorly in cell culture and possess low
infectivity in humans, requiring up to 2000-fold more PFU
of virus to infect and immunize humans than is needed for
the DEN-2 PDK-53 vaccine virus. The low infectivities of
these viruses have significant implications for vaccine
production in cell culture, potentially decreased
immunogenic efficacy, and more rapid inactivation under
conditions of a poorly maintained cold chain in tropical
countries where dengue viruses are endemic.

The purpose of engineering chimeric DEN vaccine
viruses is to enhance the replicative ability and
immunogenicity of the DEN-1, DEN-3, and DEN-4 vaccine
viruses. A primary assumption has been that the
attenuated DEN-2 PDK-53 vaccine virus replicates to
appropriate levels in cell culture. In fact, it does
appear that the genome of DEN-2 PDK-53 virus is eminently
suited to serve as the genetic backbone for chimeric
viruses containing the prM and E genes of DEN-1, DEN-3,
and DEN-4 vaccine viruses. We have recently completed
growth curves for DEN-2 16681 virus, DEN-2 PDK-53 virus,
and their infectious clone derivative viruses in LLC-MK2
cells.

The viruses were titrated in Vero cell monolayers.
These data are shown in the following table:

                  Maximum Titer    Maximum
Virus             (PFU/ml)         Titer

DEN-2 16681       2.6 x 10^8       10
D2/IC-30P-A       1.7 x 10^7       8
D2/IC-30P-A2      6.6 x 10^7       7
DEN-2 PDK-53      3.8 x 10^7       9
D2/IC-130V-F      2.9 x 10^7       7
D2/IC-130V-J      1.7 x 10^7       7

The DEN-2 PDK-53 virus and its infectious clone derivative
viruses grow to approximately 10^7 PFU/ml in LLC-MK2 cells,
about as well as the DEN-2 16681 virus.

A second assumption is that the chimeric DEN viruses
will be viable and the DEN-2 PDK-53 virus-specific
replication machinery will significantly increase
replication of the chimeric viruses in cell culture and
increase their infectivity and immunogenicity in humans
relative to the wild-type vaccine viruses. The high
degree of conservation of amino acid sequences among the
polyproteins of the four DEN viruses should ensure that
the chimeric viruses will be viable. The level of
replication attained by the chimeric DEN viruses is
determined empirically, as was determined for the DEN-2
PDK-53 infectious clone derivative virus.

Bray et al. (1991) constructed chimeric DEN-4/1 and
DEN-4/2 viruses that appeared to appropriately express
DEN-1 and DEN-2 structural protein antigens in the genetic
background of DEN-4 virus. These investigators spliced
much of the 5'-noncoding region, and the capsid, prM and E
genes of DEN-1 or DEN-2 virus into the full-length cDNA
clone of DEN-4 virus. The near 3'-terminal splice site
they chose in the E gene is very close to that proposed by
us in our project. These chimeric viruses replicated very
slowly relative to the wild-type viruses. The authors
attributed this slow replication to possible suboptimal
gene expression, assembly, and/or maturation due to
incompatibility of heterotypic genes or RNA packaging in
the nucleocapsid. Another possibility is that cDNA errors
may have been incorporated into their constructs. In
contrast, Pletnev et al. (1993) engineered chimeric
viruses between DEN-4 virus and tick-borne encephalitis
(TBE) virus, which is a very distant flavivirus relative
of DEN viruses. Thus, DEN virus chimeras may be derived
that are viable.

A third assumption is that our chimeric DEN viruses
will express the appropriate structural protein antigens
of DEN-1, DEN-3, and DEN-4 viruses, and that vaccinees
will respond with development of appropriate serum titers
of DEN-1, DEN-3, and DEN-4 neutralizing antibodies
following immunization with the chimeric viruses. We
describe the insertion of the prM and E genes of DEN-1,
DEN-3, and DEN-4 viruses into the DEN-2 clone. Thr-to-Ser
amino acid substitutions near the amino terminus of the
prM protein in DEN-2, DEN-2/1, DEN-2/3, and DEN-2/4
viruses resulting from mutagenesis to create the XhoI site
of the cassettes should be conservative in nature and
affect the phenotype of derived viruses minimally, if at
all. Alternatively, a unique MluI site (ACGCGT) could be
created via a single, silent A-to-G point mutation at
nucleotide position 453 in the DEN-2 clone. The MluI site
immediately preceding the T7 promoter could easily be
eliminated by cutting the clone with MluI, blunt-ending,
and religation. The clone-derived DEN-2 and chimeric
viruses would then have the prM amino-terminal sequence
"FHLTTR."

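Finding a position where one base change creates a restriction site, as proposed above for MluI, amounts to scanning for windows that differ from the recognition sequence at exactly one base. The Python sketch below illustrates that scan under stated assumptions: the function name and the example sequence are hypothetical (the example is not the DEN-2 clone sequence), and the sketch does not check whether a change is silent, which would also require the reading frame.

```python
def sites_one_change_away(seq, site):
    """List single-base changes that would turn a window of seq into site.

    Returns (position, old_base, new_base) tuples, with 1-based positions
    counted from the start of seq.
    """
    seq, site = seq.upper(), site.upper()
    changes = []
    for start in range(len(seq) - len(site) + 1):
        window = seq[start:start + len(site)]
        mismatches = [i for i in range(len(site)) if window[i] != site[i]]
        if len(mismatches) == 1:
            i = mismatches[0]
            changes.append((start + i + 1, window[i], site[i]))
    return changes


MLUI = "ACGCGT"         # MluI recognition sequence, as given in the text
EXAMPLE = "TTACACGTGG"  # hypothetical sequence, NOT the DEN-2 clone

if __name__ == "__main__":
    for position, old, new in sites_one_change_away(EXAMPLE, MLUI):
        print(f"{old}-to-{new} at position {position} creates {MLUI}")
```

For the hypothetical example, the only candidate is an A-to-G change at position 5. A site that must be unique in the clone would additionally require that the recognition sequence occur nowhere else in the plasmid.
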
The carboxyl-terminal 24 amino acids of the E
glycoprotein of all of the infectious clone-derived
viruses will be those of the DEN-2 PDK-53 virus.
Therefore, the E protein of all of the chimeric viruses
will have amino acid mutations in this region. Yet, the
carboxyl-terminal 39 amino acids of the DEN virus E
protein comprise membrane-spanning domains. In all
enveloped viruses, the transmembrane domains of the
integral viral proteins of related viruses are quite
variable in amino acid sequence. It has often been noted
that the important conserved feature of amino acids in
this domain lies in their hydrophobic, "lipid-loving"
nature rather than in the absolute sequence. Creation of
a MroI site (TCCGGA) or a unique AgeI site (ACCGGT) at
nucleotide positions 2281-2286 in the DEN-2 clone would
result in amino acids "SG" or "TG", respectively, at
positions E-449 and E-450 in the clone-derived viruses.
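
The "SG" and "TG" residues quoted above follow directly from translating the two recognition sequences with the standard genetic code. The Python sketch below shows the translation; it assumes that each six-base site is read in frame as two codons, as the stated nucleotide and amino acid positions imply, and the names CODON_TABLE and translate are introduced here for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    print("MroI TCCGGA ->", translate("TCCGGA"))  # Ser-Gly
    print("AgeI ACCGGT ->", translate("ACCGGT"))  # Thr-Gly
```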

The E proteins of all flaviviruses share a similar
gross tertiary structure that is indicated by the absolute
conservation of the 6 Cys residues in the prM protein and
of the 12 Cys residues in the ectodomain (the region
located on the environment side of the viral lipid
envelope) of the E protein of DEN, Japanese encephalitis,
West Nile, Murray Valley encephalitis, St. Louis
encephalitis, Kunjin, yellow fever, TBE, Langat, and
Powassan flaviviruses (data not shown). Cys residues are
involved in intrachain Cys-Cys disulfide bonds that
determine the overall structure of the protein. We fully
expect the DEN-2/1, DEN-2/3, and DEN-2/4 chimeric viruses
to be viable and to replicate more efficiently than the
wild-type DEN-1, DEN-3, and DEN-4 vaccine viruses,
respectively. Furthermore, chimeric recombinants
involving the genetic backbone of one flavivirus and the
structural genes of a variety of different flaviviruses
may also be viable, as has been demonstrated for DEN-4/TBE
virus recombinants (Pletnev et al., 1993). Such
recombinant viruses offer the potential opportunity to
engineer chimeric vaccine viruses for a number of
flavivirus-associated diseases within the genetic
background of a single flavivirus. The X-ray
crystallographic structure of the E glycoprotein of TBE
flavivirus has recently been published (Rey et al., 1995).
This development has significant implications for the
future design of flavivirus molecular vaccines.

A fourth assumption is that the chimeric DEN viruses
will retain the attenuated phenotype of the wild-type
DEN-1, DEN-3, and DEN-4 vaccine viruses, despite enhanced
replicative efficacy provided by the more efficient
nonstructural genes and 5' and 3' noncoding regions of the
DEN-2 PDK-53 virus. This presupposes that DEN-2 PDK-53
virus has attenuating mutations in the noncoding regions
or in the nonstructural genes and/or that attenuating
mutations occur in the prM/E region of the genomes of
DEN-1, DEN-3, and DEN-4 viruses. Mutations in essentially
any region of the viral genome may be capable of
attenuating a virulent virus. This has been demonstrated
for a number of viruses including poliovirus, VEE virus,
and Theiler's virus. Noncoding as well as protein coding
regions may be involved in attenuation. Attenuating
mutations in the envelope proteins of enveloped viruses
are common (Barrett et al., 1990).

The nucleotide mutations in DEN-2 PDK-53 virus at
genome nucleotide positions 57 (5'-noncoding region), 524
(prM), 2579 (NS1), 4018 (NS2A), and 6599 (NS4A) may be
involved in attenuation of the virus. Unless the prM
amino acid mutation is the only mutation affecting
virulence of the virus, the DEN-2 PDK-53 genetic
background, within which the structural genes from
heterologous viruses will be expressed, does itself
possess genotypic markers of attenuation. We can
determine the genetic loci involved in the attenuation of
the DEN-2 PDK-53 virus by analyzing DEN-2 16681/PDK-53
recombinant viruses derived from chimeric 16681/PDK-53
full-length clones. The E gene of DEN-2 PDK-53 virus
contains no attenuating mutations.

Although investigators have sequenced the structural
genes of numerous DEN-3 virus strains (e.g., Lanciotti et
al., 1994), none has sequenced the DEN-3 16562 virus,
parent to the DEN-3 PGMK-30/FRhL-3 vaccine virus. After
determining the sequences of the prM and E genes of this
virus, we can establish whether any amino acid mutations
have occurred within these genes in the DEN-3 vaccine
virus. By comparison, nucleotide sequence information for
the parental DEN-1 and DEN-4 viruses has been determined
(unpublished data (parental DEN-1 virus); Lanciotti et
al., submitted for publication (parental DEN-4 virus)).
The nucleotide sequences of the E gene of DEN-4 1036 virus
and both prM and E genes of DEN-1 16007 virus have been
determined. The following amino acid mutations were
identified:
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Amino Acid 



E Protein 

Virus Amino Acid Parent Vaccine 

5 type Position Strain Strain 



DEN- 1 E-13 0 Val Ala 

E-203 Glu Lys 

E-204 Arg Lys 

10 E-225 Ser Leu 

E-384 Ala Glu 

E-477 Met Val 

DEN- 4 E^345 Glu Lys 

15 E-364 Val Ala 



There were six amino acid mutations in the E protein of
DEN-1 16007 PDK-13 virus and two mutations in that of DEN-4
1036 PDK-48 virus. There were no amino acid substitutions
in the prM protein of the DEN-1 vaccine virus. Glu-to-Lys
and Lys-to-Glu amino acid substitutions, as occur at DEN-1
E-203 and DEN-4 E-345, are common motifs in sequence
comparisons between parent viruses and their vaccine
derivatives. It is likely that the heterologous prM/E
cDNA inserts in recombinant full-length cDNA clones will
transport genetic loci of attenuation into the chimeric
DEN-2/1, DEN-2/3, and DEN-2/4 virus derivatives. The
optimum scenario for the chimeric viruses involves
increased replication ability in the presence of genetic
loci of attenuation in the heterologous DEN-1, DEN-3, and
DEN-4 structural gene inserts within the genetic
background of the DEN-2 PDK-53 virus.

Nucleotide sequence analysis of expressed genes is
essential. The error rate in the original RT/PCR-derived
cDNA clones of DEN-2 16681 virus was 8.2 x 10^-4, that is, 1
cDNA error for every 1227 nucleotides of cloned, sequenced
cDNA. In a previous sequencing project involving VEE
virus and employing classical, non-PCR cDNA synthesis
methodology, the error rate was calculated to be 3.9 x 10^-4,
or 1 error for every 2543 nucleotides of cloned, sequenced
cDNA. These errors are due to nucleotide incorporation
errors by reverse transcriptase during first strand cDNA
synthesis and perhaps to the cloning of individual
variants within the original population of virions.
Unlike many DNA polymerases, RNA polymerases and reverse
transcriptase have no editing function. Incorrect
nucleotides incorporated during strand elongation are not
detected or removed before continuing. The Taq DNA
polymerase is also known to incorporate errors into PCR
amplicons. Thus, at least 4-8 cDNA "errors" can be
expected to occur in 10 kb of cloned cDNA. We have
observed the incorporation of spurious in-frame
termination codons (TAA, TAG, TGA) in cDNA clones derived
from both VEE and DEN viruses. Premature termination of
amino acid translation would result in a truncated protein
and would undoubtedly be a lethal mutation for a candidate
infectious clone. Much of the utility of genes expressed
in vitro is compromised when those genes are not
characterized by sequence analysis. If cDNA errors occur
in candidate infectious cDNA clones, it may be difficult
to determine if phenotypic effects of directed mutations
are due to the engineered mutation, to cDNA errors, or to
synergistic action or compensation between errors and
engineered mutations.
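
The arithmetic behind the "4-8 errors in 10 kb" estimate, and the kind of check that would flag a spurious in-frame termination codon, can be sketched as follows. This is only an illustrative sketch: the two error rates are the figures quoted above, while the function names and the toy sequence are hypothetical and not part of the described work.

```python
# Error rates quoted in the text (errors per nucleotide of cloned cDNA).
RT_PCR_RATE = 8.2e-4      # RT/PCR-derived DEN-2 16681 clones
CLASSICAL_RATE = 3.9e-4   # classical, non-PCR VEE cDNA synthesis

STOP_CODONS = {"TAA", "TAG", "TGA"}


def expected_errors(rate, length_nt):
    """Expected number of cDNA errors in length_nt nucleotides."""
    return rate * length_nt


def in_frame_stops(cdna, cds_start, cds_end):
    """Return 1-based start positions of in-frame stop codons.

    cds_start and cds_end are 1-based, inclusive coordinates of the
    reading frame to scan (hypothetical helper; any hit upstream of
    the authentic terminator would be a premature stop).
    """
    seq = cdna.upper()
    hits = []
    for pos in range(cds_start, cds_end - 1, 3):
        codon = seq[pos - 1:pos + 2]
        if codon in STOP_CODONS:
            hits.append(pos)
    return hits


if __name__ == "__main__":
    for label, rate in (("classical", CLASSICAL_RATE), ("RT/PCR", RT_PCR_RATE)):
        print(label, round(expected_errors(rate, 10_000), 1), "errors per 10 kb")
    # Toy reading frame: ATG AAA TAG CCC TGA
    print(in_frame_stops("ATGAAATAGCCCTGA", 1, 15))
```

Applied to a 10-kb clone, the two quoted rates bracket the 4-8 errors mentioned in the text; the scan simply reports every in-frame TAA, TAG, or TGA so that hits before the expected terminator can be inspected.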

Wiktor et al. (1984) reported that two cDNA errors
caused spurious amino acid substitutions in rabies virus
glycoprotein expressed in recombinant vaccinia virus and
resulted in expression of non-authentic rabies
glycoprotein. After sequence analysis and correction of
the cDNA, expression of authentic rabies glycoprotein was
obtained. A faulty cDNA clone may behave as expected in
one circumstantial context, yet behave very
inappropriately and be highly misleading in a different
context. A faulty structural gene cDNA clone of the
virulent VEE Trinidad donkey (TRD) virus that was
expressed in recombinant vaccinia virus was essentially
authentic by monoclonal antibody analysis of expressed VEE
virus-specific proteins and by protection of immunized
mice from challenge with virulent VEE virus (Kinney et
al., 1988a; Kinney et al., 1988b). However, incorporation
of this cDNA clone into an infectious cDNA clone of VEE
virus completely abrogated the virulence of the
clone-derived virus, whereas the corrected cDNA fragment
resulted in derivation of virulent virus (Kinney et al.,
1993).

Although Lai et al. (1991) originally derived their
infectious clone of DEN-4 virus from sequence-characterized
subunit cDNA clones (Zhao et al., 1986;
Mackow et al., 1987), the original full-length clone was
not infectious (Lai et al., 1991). While these
investigators indicated that they sequenced both strands
of much of the cloned genomic cDNA, they did not indicate
that they sequenced more than a single clone for a given
cDNA region. Nucleotides encoding cDNA errors will be
confirmed on both cDNA strands, but will not be identified
as errors unless two or more independent
cDNA clones covering the same region of the genome are
sequenced. The functional full-length clone of DEN-4
virus was obtained by repeated splicing of large new cDNA
fragments into the full-length clone until a functional
clone was obtained. The authors did not indicate that the
newly cloned regions were characterized by nucleotide
sequence analysis (Lai et al., 1991). It is probable that
the slowed replication of the DEN-4/1 and DEN-4/2 chimeric
viruses relative to wild-type viruses reported by Bray et
al. (1991) is due to the presence of cDNA artifacts within
the full-length cDNA clone. The critical importance of
accurate nucleotide sequence characterization of genes
expressed in vitro, particularly when those genes are
expressed in the form of infectious cDNA clones, is still
not widely appreciated in the molecular biology
field.
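
The point that a cDNA error is only recognizable when independent clones of the same region are compared can be illustrated with a small sketch. It assumes the clone sequences have already been aligned to equal length; the function name and the toy sequences are hypothetical.

```python
from collections import Counter


def discordant_positions(clones):
    """Compare aligned clone sequences column by column.

    clones maps a clone name to its sequence. Returns a list of
    (1-based position, {clone: base}, majority base or None).
    The majority base is None when no base is present in more
    than half of the clones, e.g. two clones that disagree.
    """
    names = list(clones)
    seqs = [clones[name].upper() for name in names]
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("clone sequences must be aligned to equal length")
    report = []
    for pos, column in enumerate(zip(*seqs), start=1):
        counts = Counter(column)
        if len(counts) == 1:
            continue  # all clones agree at this position
        base, n = counts.most_common(1)[0]
        majority = base if n * 2 > len(column) else None
        report.append((pos, dict(zip(names, column)), majority))
    return report


if __name__ == "__main__":
    one_clone = {"clone_a": "ATGTAACGC"}
    three_clones = {
        "clone_a": "ATGTAACGC",
        "clone_b": "ATGAAACGC",
        "clone_c": "ATGAAACGC",
    }
    print(discordant_positions(one_clone))     # nothing to compare against
    print(discordant_positions(three_clones))  # position 4 flagged
```

With a single clone the report is necessarily empty, however many times both strands are read; with two clones a disagreement is flagged but not resolved; a third independent clone allows a majority call.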

Although putative nucleotide sequences for the
genomes of DEN-2 16681 and DEN-2 PDK-53 viruses have been
reported in the literature (Blok et al., 1992), our
sequence results indicate that the published data is
highly flawed. Blok et al. (1992) reported 53 nucleotide
mutations between the two viruses; we determined only 8
mutations. We analyzed at least two independent cDNA
clones for regions covering the entire genomes of both
viruses. The DEN-2 16681 sequencing project was completed
prior to receiving the DEN-2 PDK-53 virus in our
laboratory, and the nucleotide sequence of the PDK-53
virus was determined from cDNA amplified directly from
virus present in vaccine vials.

There are now only two classes of infectious clones
developed for vaccine flaviviruses that have themselves
been administered to humans: the infectious clone of
yellow fever virus, vaccine strain 17D (Rice et al., 1989;
Hahn et al., 1987; Rice et al., 1985), and the DEN-1,
DEN-2, DEN-3, and DEN-4 vaccine derivative infectious clones
described herein. Both classes of infectious clones have
the important advantage of being derived from vaccine
viruses that have been tested for efficacy and safety in
humans. The yellow fever 17D virus vaccine has long been
one of the most effective human vaccines developed;
immunization with this virus provides lifelong immunity.
In the case of DEN virus, it is essential that vaccines
provide immunity against infection by all four serotypes
of the virus. DEN-1, DEN-2, DEN-3, and DEN-4 vaccine
viruses have been developed at Mahidol University,
Bangkok, Thailand. All four vaccine viruses have been
tested in humans and have been demonstrated to be
immunogenic and safe for human adults.

Replicating vaccines in the form of live, attenuated
viruses offer distinct advantages in terms of immunogenic
efficacy due to replicative amplification of viral
antigens (antigenic mass) in the vaccinees and replication
in appropriate target tissues. Inactivated or subunit
antigens usually suffer from a lack of sufficient
antigenic mass and subsequent failure to stimulate an
effective immune response. Expression of proteins in
recombinant vaccinia virus, which replicates primarily at
the site of inoculation, may provide protection against
parenteral challenge with virulent virus, but may not
protect against an aerosol challenge. This was
demonstrated for VEE virus when it was shown that
recombinant vaccinia virus expressing the structural
proteins of VEE virus protected mice from intraperitoneal
challenge, but not intranasal challenge, with virulent VEE
virus (Kinney et al., 1988b). Immunization with the live,
attenuated VEE TC-83 vaccine virus, on the other hand,
provided immunity against both parenteral challenge
(immunity provided by circulating serum IgG antibody) and
intranasal challenge (mucosal, IgA-based immunity) with
virulent VEE virus. Furthermore, the level of immunity,
as measured by titers of VEE virus-specific neutralizing
antibody, was considerably higher in TC-83
virus-immunized mice and horses (the natural epidemic host for
VEE virus) than in animals immunized with recombinant
vaccinia/VEE virus (Kinney et al., 1988b; Bowen et al.,
1992). Similar results have been reported for
vaccinia/influenza A virus recombinants in rodents (Smith
et al., 1986). Furthermore, a replicating vaccine virus
provides the appropriate T-cell epitopes to stimulate
cell-mediated immunity as well as humoral immunity.
T-cell epitopes may be lacking in subunit vaccines. In
short, vaccination with a safe live, attenuated vaccine
virus provides the optimal immunization of a natural
infection in terms of the type and level of immunity
elicited and the repertoire of viral antigens involved in
generating the immune response.

To use the DEN viruses described herein as vaccine
candidates, it is necessary to rederive the viruses by
transfection of a cell line, such as primary dog kidney,
certified for human use under conditions of good
laboratory practice and management to ensure the avoidance
of potential adventitious agents that might be present in
uncertified cell lines. Although the cDNA-derived viruses
originate from candidate vaccine viruses that have
undergone testing in humans, they require recertification
by analysis for possible in vitro phenotypic markers of
attenuation and by safety testing in small animals and
probably nonhuman primates. All investigative studies
involving the pathogenesis of DEN virus are hampered by
the unavailability of a suitable animal model. Certain
in vitro characteristics are apparently associated with
attenuation of DEN viruses, but the only definitive test
is vaccine trial in human volunteers. Vaccine trials
would presumably follow those of the original wild-type
vaccine viruses developed at Mahidol University. The
protocol includes titration of the individual vaccine
virus candidates in adult human volunteers to determine
the minimal infectious/immunogenic dose for each virus.
This is followed by immunization trials with different
bivalent and trivalent combinations of vaccine virus. The
final test is the quadravalent vaccine composed of
appropriate doses of all four vaccine viruses. If the
preliminary trials are successful, larger trials are
scheduled, and the vaccine viruses are tested in children,
who are the primary target for vaccine delivery.
We describe herein a preferred method to develop an
infectious cDNA clone for a flavivirus. Optimally, a
wild-type vaccine virus serves as the template for the
clone construction. Large cDNA fragments are amplified
from the genomic mRNA by PCR using virus-specific primers
and directly cloned into a TA-vector or into the MCS of a
low-copy number plasmid following restriction of the
amplicon cDNA. The low-copy pBRUC-139 vector contains the
MCS of pUC19 to permit convenient cloning of cDNA using a
variety of RENZ sites. Other low-copy plasmids are
available. The bacteriophage T7 or SP-6 promoter is
usually engineered into the 5'-terminal mRNA-sense
amplimer, and a unique RENZ site for linearization of the
recombinant plasmid containing the full-length cDNA must
be engineered into the 3'-terminal complementary
(negative)-sense amplimer. Exhaustive nucleotide analysis
of the cDNA clones is desirable.
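
As an illustration of how such terminal amplimers are laid out, the sketch below concatenates the segments of two primers listed in Appendix A (D2-1-ECO.T7 and CD2-10687.XBA) and checks the resulting lengths against the mer values given there. The segment sequences are taken from the appendix; the variable and function names are hypothetical.

```python
# Segments of the 5'-terminal, mRNA-sense amplimer D2-1-ECO.T7
# (5'-Fill/EcoRI/XbaI/T7 promoter/5'-end of DEN-2), as listed in Appendix A.
FILL_5 = "GCGGATATTG"
ECORI = "GAATTC"
XBAI = "TCTAGA"
T7_PROMOTER = "AATTTAATACGACTCACTATA"
DEN2_5_END = "AGTTGTTAGTCTACGTGGACCGACAAAGACAG"

# Segments of the 3'-terminal, negative-sense amplimer CD2-10687.XBA
# (5'-Fill/SphI/XbaI/3'-end of DEN-2 sequence), as listed in Appendix A.
FILL_3 = "TTATCACCTA"
SPHI = "GCATGC"
DEN2_3_END_COMPLEMENT = "AGAACCTGTTGATTCAACAGCACCATTCCATTTTCTG"


def assemble(*segments):
    """Join primer segments 5' to 3' into a single oligonucleotide."""
    return "".join(segments)


forward_primer = assemble(FILL_5, ECORI, XBAI, T7_PROMOTER, DEN2_5_END)
reverse_primer = assemble(FILL_3, SPHI, XBAI, DEN2_3_END_COMPLEMENT)

if __name__ == "__main__":
    print(len(forward_primer), forward_primer)
    print(len(reverse_primer), reverse_primer)
```

In this layout the promoter sits immediately upstream of the first genomic nucleotide in the forward primer, and the XbaI site in the reverse primer lies just beyond the 3' end of the genome, where a unique site for linearizing the plasmid is needed.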
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APPENDIX A 



PRIMERS DESIGNED FOR DEN-2 CLONING/SEQUENCING PROJECT:

SEQ
ID   PRIMER         MER/SENSE  SEQUENCE

3    pUC/M13-P5     25/+     5'-CCCAGTCACGACGTTGTAAAACGAC-3'
4    pUC/M13-P5B    27/+     5'-GGATGTGCTGCAAGGCGATTAAGTTGG-3'
5    pUC/M13-P3     25/+     5'-TGAGCGGATAACAATTTCACACAGG-3'
6    pUC/M13-P3B    27/-     5'-GGCTTTACACTTTATGCTTCCGGCTCG-3'
7    D2-1-ECO.T7    75/+     5'-GCGGATATTG/GAATTC/TCTAGA/
                             AATTTAATACGACTCACTATA/
                             AGTTGTTAGTCTACGTGGACCGACAAAGACAG-3'
                             (5'-Fill/EcoRI/XbaI/T7 Promoter/
                             5'-end of DEN-2)
8    D2-SMT71       77/+     5'-CCAGT/GAATTC/GAGCTC/ACGCGT/
                             AAATTTAATACGACTCACTATA/
                             AGTTGTTAGTCTACGTGGACCGACAAAGACAG-3'
                             (5'-Fill/EcoRI/SstI/MluI/T7 Promoter/
                             5'-end of DEN-2)
9    D2-1           24/+     5'-AGTTGTTAGTCTACGTGGACCGAC-3'
10   D2-28          34/+     5'-GACAGATTCTTTGAGGGAGCTGAGCTCAACGTAG-3'
11   D2-134         28/+     5'-TCAATATGCTGAAACGCGAGAGAAACCG-3'
12   CD2-250        26/-     5'-GGGATTGTTAGGAAACGAAGGAACGC-3'
13   D2-274         32/+     5'-CCACCAACAGCAGGGATACTGAAAAGATGGGG-3'
14   CD2-378        25/-     5'-TGCAGATCTGCGTCTCCTATTCAAG-3'
15   D2-528         25/+     5'-CGTGAACATGTGTACCCTCATGGCC-3'
16   CD2-616        26/-     5'-TTGCACCAACAGTCAATGTCTTCAGG-3'
17   D2-616         25/+     5'-ACCAGAAGACATAGATTGTTGGTGC-3'
18   CD2-618        25/-     5'-GCACCAACAGTCTATGTCTTCTGGC-3'
19   CD2-771        25/-     5'-ATGTTTCCAGGCCCCTTCTGATGAC-3'
20   D2-847         25/+     5'-GCAGCAATCCTGGCATACACCATAG-3'
21   D2-996         27/+     5'-GGTTGACATAGTCTTAGAACATGGAAG-3'
22   CD2-996        27/-     5'-CTTCCATGTTCTAAGACTATGTCAACC-3'
23   D2-1005        35/+     5'-GTCTTAGAACATGGAAGTTGTGTGACGACGATGGC-3'
24   D2-1141        25/+     5'-ACAACAGAATCTCGCTGCCCAACAC-3'
25   D2-1211        25/+     5'-GCAAACACTCCATGGTAGACAGAGG-3'
26   CD2-1211       25/-     5'-CCTCTGTCTACCATGGAGTGTTTGC-3'
27   CD2-1227       27/-     5'-CCACATCCATTTCCCCATCCTCTGTCT-3'
28   D2-1261        30/+     5'-GGAAAGGGAGGCATTGTGACCTGTGCTATG-3'
29   D2-1416        28/+     5'-GGAAATCAAAATAACACCACAGAGTTCC-3'
30   CD2-1503       34/-     5'-CTGCAGCAACACCATCTCATTGAAGTCGAGGCCC-3'
31   D2-1510        25/+     5'-GACTTCAATGAGATGGTGCTGCTGC-3'
32   CD2-1510       25/+     5'-GCAGCAGCACCATCTCATTGAAGTC-3'
33   D2-1546        28/+     5'-AAGCTTGGCTGGTGCACAGGCAATGGTT-3'
34   CD2-1567       27/-     5'-TGGTAACGGCAGGTCTAGGAACCATTG-3'
35   D2-1777        23/+     5'-GGACATCTCAAGTGCAGGCTGAG-3'
36   CD2-1777       23/+     5'-CTCAGCCTGCACTTGAGATGTCC-3'
37   D2-1863        27/+     5'-GAAGGAAATAGCAGAAACACAACATGG-3'
38   CD2-1888       33/-     5'-CCCTTCATATTGTACTCTGATAACTATTGTTCC-3'
39   D2-2047        32/+     5'-CCTCCATTCGGAGACAGCTACATCATCATAGG-3'
40   CD2-2047       32/-     5'-CCTATGATGATGTAGCTGTCTCCGAATGGAGG-3'
41   D2-2170        29/+     5'-ATGGCCATTTTAGGTGACACAGCCTGGGA-3'
42   CD2-2200       27/-     5'-TGTAAACACTCCTCCCAGGGATCCAAA-3'
43   D2-2308        29/+     5'-CTCATAGGAGTCATTATCACATGGATAGG-3'
44   CD2-2504       35/-     5'-GGGGATTCTGGTTGGAACTTATATTGTTCTGTCC-3'
45   CD2-2622       30/-     5'-TGATTCAATTCTGGTGTTATTTGTTTCCAC-3'
46   D2-2702        25/+     5'-AAGGAATCATGCAGGCAGGAAAACG-3'
47   CD2-2864       22/-     5'-ACTTCCAGCGAGTTCCAAGCTC-3'
                                A A
48   D2-2992        25/+     5'-AACAGAGCCGTCCATGCCGATATGG-3'
49   CD2-3105       22/-     5'-TCCATTGCTCCAAAGGGTGTGT-3'
                                G
50   D2-3236        25/+     5'-AGCTTGAGATGGACTTTGATTTCTG-3'
51   CD2-3410       22/-     5'-GGTCTGATTTCCATCCCGTACC-3'
52   D2-3621        23/+     5'-GTCCTTTAGAGACCTGGGAAGAG-3'
53   CD2-3739       25/-     5'-G^TTTCTCAAGAGTAGTCCAGCTGC-3'
                                C
54   D2-3905        25/+     5'-ATCAATTGGCAGTGACTATCATGGC-3'
55   CD2-4002       25/-     5'-TGTTAAGAGCAGTGGAGAAACGGAC-3'
                                A G
56   CD2-4060       25/-     5'-GATTGAGACCTTTGATCGTCAACGC-3'
57   D2-4214        25/+     5'-TGACAGGACCATTAGTGGCTGGAGG-3'
58   D2-4257        34/+     5'-CGTGCTCACTGGACGATCGGCCGATTTGGAACTG-3'
59   CD2-4323       24/-     5'-GGGCTGCTTCCTGATAT1TCTGCC-3'
                                C
60   D2-4497        25/+     5'-CCTGTGGGAAGTGAAGAAACAACGG-3'
61   CD2-4557       30/-     5'-GCTCCATCTTCCAGTTCAGCCTTTCCCATG-3'
62   CD2-4615       25/-     5'-CTCCGGCTCC&ATCTGAGAGTATCC-3'
                                G G A
63   D2-4746        25/+     5'-CCTAATATCATATGGAGGAGGCTGG-3'
64   D2-4792        25/+     5'-GAAGGAGAAGAAGTCCAGGTATTGG-3'
65   CD2-4922       25/-     5'-£TGTCGA£AATTGGAGATCCTGACG-3'
                                T T
66   D2-4994        25/+     5'-GTGGAGCATATGTGAGTGCTATAGC-3'
67   D2-5124        25/+     5'-TCTGACTATGGCCGGAAGGTATCTC-3'
68   D2-5173        25/+     5'-ACATTAATCTTGGCCCCCACTAGAG-3'
69   CD2-5272       19/-     5'-CGATCTCCCGCCCGGTGTG-3'
                                A
70   CD2-5318       25/-     5'-CTAACTGGTGATAGCAGCCTCATGG-3'
71   CD2-5656       27/-     5'-CCTACTGAGTTGTATCACTTTCTTTCC-3'
72   CD2-5891       26/-     5'-TGGATTTCTTCCTATTCTCCCTCTTC-3'
73   D2-5770        25/+     5'-TTCAAGGCTGAGAGGGTTATAGACC-3'
74   D2-6152        25/+     5'-TCTGGTTGGCCTACAGAGTGGCAGC-3'
75   CD2-6252       27/-     5'-CCTTCTTTTGTCCAGATTTC£ACTTCC-3'
                                A
76   D2-6493        35/+     5'-GCGTACAACCATGCTCTCAGTGAACTGCCGGAGAC-3'
77   CD2-6605       24/-     5'-TTCCCAGGGTCATCTTCCCTATAC-3'
                                G
78   CD2-6624       31/-     5'-GATGCTAGCCGTGATTATGCAGCACATTCCC-3'
79   D2-6748        25/+     5'-AAACAGAGAACACCCCAAGACAACC-3'
80   CD2-6932       21/-     5'-CGGCATACAGCGTCCATGCTG-3'
81   D2-7055        25/+     5'-GTCTCGGGAAAGGATGGCCATTGTC-3'
82   CD2-7195       25/-     5'-CTCTGGTTGCTTTTGCTTGAAGTCC-3'
                                A G G
83   CD2-7217       27/-     5'-CCGCCGCTGCTCTTTTCTGAGCTTCTC-3'
84   D2-7378        25/+     5'-AGGACTACATGGGCTCTGTGTGAGG-3'
85   CD2-7515       19/-     5'-GAGAAGTCCAGCTCCGGCC-3'
86   D2-7769        25/+     5'-AGAGAAACATGGTCACACCAGAAGG-3'
87   CD2-7885       22/-     5'-GTTCTTCGTGTCCTGGTCCTCC-3'
88   D2-8165        25/+     5'-GGAAATATGGAGGAGCCTAGTGAGG-3'
89   CD2-8210       22/-     5'-ACCCAGTACATCTCATGTGTGG-3'
90   D2-8428        28/+     5'-GAGCATGAAACATCATGGCACTATGACC-3'
91   D2-8440        25/+     5'-TCATGGCACTATGACCAAGACCACC-3'
92   CD2-8529       22/-     5'-CAGXCTGA£CACTCCGTT£ACC-3'
                                C A G
93   D2-8773        25/+     5'-AAGGTGAGAAGCAATGCAGCCTTGG-3'
94   D2-8798        29/+     5'-GGGCCATATTCACTGATGAGAACAAGTGG-3'
95   CD2-8865       22/-     5'-TCTTTCCCTGTCAACCAGCTCC-3'
                                C T
96   D2-9046        25/+     5'-AATGAAGATCACTGGTTCTCCAGAG-3'
97   D2-9131        25/+     5'-ACGTGAGCAAGAAAGAGGGAGGAGC-3'
98   CD2-9166       22/-     5'-TGTCCCATCCTGCJGTGTCATC-3'
                                A G
99   CD2-9234       30/-     5'-GCTAGTTTCTTGTGTTCTCCTTCCATGTGG-3'
100  D2-9344        25/+     5'-TCATATCGAGAAGAGACCAAAGAGG-3'
101  CD2-9429       24/-     5'-ACTCCTTCTCCCTCCATCTGTCTG-3'
102  CD2-9438       27/-     5'-ATGCTTTTGAAGATTCCTTCTCCCTCC-3'
                                A C
103  CD2-9468       32/-     5'-GCACAGCGATTTCTTCTGTGATTGTTAGGTGC-3'
104  D2-9645        25/+     5'-ACAATGGGAACCTTCAAGAGGATGG-3'
105  D2-9656.BAM    45/+     5'-TTATGACATT/GGATCC/TTCAAGAGGATGGA
                             ATGATTGGACACAAG-3'
                             (5'-Fill/BamHI/DEN-2 Sequence)
106  CD2-9668       28/-     5'-CAGAAGGGCACTTGTGTCCAATCATTCC-3'
107  CD2-9779       21/-     5'-CTCCCTGGGAAATTCGGGCTC-3'
                                T G
108  CD2-9796       28/-     5'-CCGTCTCCCGCAAAGACCACCCTGCTCC-3'
109  CD2-9796.XBA   44/-     5'-TTATCACCTA/TCTAGA/CCGTCTCCC
                             GCAAAGACCACCCTGCTCC-3'
110  CD2-9913       26/-     5'-GTTGGAACCCAATGTGATGGTACTGC-3'
111  D2-9937        25/+     5'-ACAAGTCGAACAACCTGGTCCATAC-3'
112  CD2-9977       21/-     5'-GCATGTCTTCCGTCGTCATCC-3'
                                T
113  CD2-10003      25/-     5'-CTTGAATCCACACCCTGTTCCAGAC-3'
114  D2-10203       25/+     5'-ATACACAGATTACATGCCATCCATG-3'
115  CD2-10261      21/-     5'-TTTTGCCTTCTACCACAGGAC-3'
                                T A
116  D2-10289       25/-     5'-GAAACAAGGCTAGAAGTCAGGTCGG-3'
117  CD2-10337      23/-     5'-GACGGGGCTCACAGGTAGCATAG-3'
118  D2-10418       25/+     5'-GCCTGTAGCTCCACCTGAGAAGGTG-3'
119  D2-10470       25/+     5'-GGAAGCTGTACGCATGGCGTAGTGG-3'
120  CD2-10530      19/-     5'-GGGCCCCCGTTGTTGCTGC-3'
                                A
121  CD2-10687      59/-     5'-AGAACCTGTTGATTCAACAGCACCATTCCATTTTCTG-3'
122  CD2-10687.XBA  59/-     5'-TTATCACCTA/GCATGC/TCTAGA/
                             AGAACCTGTTGATTCAACAGCACCATTCCATTTTCTG-3'
                             (5'-Fill/SphI/XbaI/
                             3'-End DEN-2 Sequence)
123  CD2-10687.X2   52/-     5'-TTATCACCTA/TCTAGA/
                             GAACCTGTTGATTCAACAGCACCATTCCATTTTCTG-3'
                             (5'-Fill/XbaI/
                             3'-End DEN-2 Sequence)
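
Several of the primers above come in genome-sense/complementary pairs that share a map position (for example D2-996 and CD2-996). A minimal sketch of how a transcription of such a pair can be checked is given below; the sequences are those listed in the table, while the helper names are hypothetical.

```python
# Translation table for the four unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# (genome-sense primer, complementary primer) pairs from the table above.
PAIRS = {
    "996": ("GGTTGACATAGTCTTAGAACATGGAAG", "CTTCCATGTTCTAAGACTATGTCAACC"),
    "1211": ("GCAAACACTCCATGGTAGACAGAGG", "CCTCTGTCTACCATGGAGTGTTTGC"),
    "1510": ("GACTTCAATGAGATGGTGCTGCTGC", "GCAGCAGCACCATCTCATTGAAGTC"),
    "1777": ("GGACATCTCAAGTGCAGGCTGAG", "CTCAGCCTGCACTTGAGATGTCC"),
    "2047": ("CCTCCATTCGGAGACAGCTACATCATCATAGG", "CCTATGATGATGTAGCTGTCTCCGAATGGAGG"),
}


def mismatched_pairs(pairs):
    """Return the labels of pairs that are not exact reverse complements."""
    return [label for label, (plus, minus) in pairs.items()
            if reverse_complement(plus) != minus]


if __name__ == "__main__":
    print(mismatched_pairs(PAIRS))
```

The check applies only to same-position pairs of equal length; primers carrying fill or restriction-site extensions, or annotated mismatches, would need the genome-specific portion extracted first.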
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While particular embodiments of. the invention have 
been described in detail, it will be apparent to those
skilled in the art that these embodiments are exemplary
rather than limiting, and the true scope of the
invention is that defined within the attached claims.
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SEQUENCE LISTING 



(1) GENERAL INFORMATION

(i) APPLICANT: MAHIDOL UNIVERSITY
    Bangkok, Thailand

    The United States of America, as represented by the Secretary,
    Department of Health and Human Services
    c/o Centers for Disease Control and Prevention
    Technology Transfer Office
    Mail Stop E-67
    1600 Clifton Road
    Atlanta, Georgia 30333

(ii) TITLE OF THE INVENTION: INFECTIOUS CDNA CLONES FOR DENGUE 2
    VIRUS ...

(iii) NUMBER OF SEQUENCES: 137

(iv) CORRESPONDENCE ADDRESS:
    (A) ADDRESSEE: NEEDLE & ROSENBERG, P.C.
    (B) STREET: Suite 1200, 127 Peachtree Street, NE
    (C) CITY: Atlanta
    (D) STATE: GA
    (E) COUNTRY: USA
    (F) ZIP: 30303

(v) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: Diskette
    (B) COMPUTER: IBM Compatible
    (C) OPERATING SYSTEM: DOS
    (D) SOFTWARE: FastSEQ Version 1.5

(vi) CURRENT APPLICATION DATA:
    (A) APPLICATION NUMBER: U.S. Serial No. 08/483,292
    (B) FILING DATE: 7 Jun 1995
    (C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
    (A) APPLICATION NUMBER:
    (B) FILING DATE:

(viii) ATTORNEY/AGENT INFORMATION:
    (A) NAME: Spratt, Gwendolyn D.
    (B) REGISTRATION NUMBER: 36,016
    (C) REFERENCE/DOCKET NUMBER: 14114.0179/P

(ix) TELECOMMUNICATION INFORMATION:
    (A) TELEPHONE: 404-688-0770
    (B) TELEFAX: 404-688-9880
    (C) TELEX:

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 10723 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(ix) FEATURE:
    (A) NAME/KEY: Coding Sequence
    (B) LOCATION: 97...10269
    (D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

AGTTGTTAGT CTACGTGGAC CGACAAAGAC AGATTCTTTG AGGGAGCTAA GCTCAACGTA  60

GTTCTAACAG TTTTTTAATT AGAGAGCAGA TCTCTG ATG AAT AAC CAA CGG AAA  114
                                        Met Asn Asn Gln Arg Lys
                                        1               5

AAG GCG AAA AAC ACG CCT TTC AAT ATG CTG AAA CGC GAG AGA AAC CGC  162
Lys Ala Lys Asn Thr Pro Phe Asn Met Leu Lys Arg Glu Arg Asn Arg
            10                  15                  20

GTG TCG ACT GTG CAA CAG CTG ACA AAG AGA TTC TCA CTT GGA ATG CTG  210
Val Ser Thr Val Gln Gln Leu Thr Lys Arg Phe Ser Leu Gly Met Leu
        25                  30                  35

CAG GGA CGA GGA CCA TTA AAA CTG TTC ATG GCC CTG GTG GCG TTC CTT  258
Gln Gly Arg Gly Pro Leu Lys Leu Phe Met Ala Leu Val Ala Phe Leu
    40                  45                  50

CGT TTC CTA ACA ATC CCA CCA ACA GCA GGG ATA TTG AAG AGA TGG GGA  306
Arg Phe Leu Thr Ile Pro Pro Thr Ala Gly Ile Leu Lys Arg Trp Gly
55                  60                  65                  70

ACA ATT AAA AAA TCA AAA GCT ATT AAT GTT TTG AGA GGG TTC AGG AAA  354
Thr Ile Lys Lys Ser Lys Ala Ile Asn Val Leu Arg Gly Phe Arg Lys
                75                  80                  85

GAG ATT GGA AGG ATG CTG AAC ATC TTG AAT AGG AGA CGC AGA TCT GCA  402
Glu Ile Gly Arg Met Leu Asn Ile Leu Asn Arg Arg Arg Arg Ser Ala
            90                  95                  100

GGC ATG ATC ATT ATG CTG ATT CCA ACA GTG ATG GCG TTC CAT TTA ACC  450
Gly Met Ile Ile Met Leu Ile Pro Thr Val Met Ala Phe His Leu Thr
        105                 110                 115

ACA CGT AAC GGA GAA CCA CAC ATG ATC GTC AGC AGA CAA GAG AAA GGG  498
Thr Arg Asn Gly Glu Pro His Met Ile Val Ser Arg Gln Glu Lys Gly
    120                 125                 130

AAA AGT CTT CTG TTT AAA ACA GAG GAT GGC GTG AAC ATG TGT ACC CTC  546
Lys Ser Leu Leu Phe Lys Thr Glu Asp Gly Val Asn Met Cys Thr Leu
135                 140                 145                 150

ATG GCC ATG GAC CTT GGT GAA TTG TGT GAA GAC ACA ATC ACG TAC AAG  594
Met Ala Met Asp Leu Gly Glu Leu Cys Glu Asp Thr Ile Thr Tyr Lys
                155                 160                 165

TGT CCC CTT CTC AGG CAG AAT GAG CCA GAA GAC ATA GAC TGT TGG TGC  642
Cys Pro Leu Leu Arg Gln Asn Glu Pro Glu Asp Ile Asp Cys Trp Cys
            170                 175                 180

AAC TCT ACG TCC ACG TGG GTA ACT TAT GGG ACG TGT ACC ACC ATG GGA  690
Asn Ser Thr Ser Thr Trp Val Thr Tyr Gly Thr Cys Thr Thr Met Gly
        185                 190                 195

GAA CAT AGA AGA GAA AAA AGA TCA GTG GCA CTC GTT CCA CAT GTG GGA  738
Glu His Arg Arg Glu Lys Arg Ser Val Ala Leu Val Pro His Val Gly
    200                 205                 210

ATG GGA CTG GAG ACA CGA ACT GAA ACA TGG ATG TCA TCA GAA GGG GCC  786
Met Gly Leu Glu Thr Arg Thr Glu Thr Trp Met Ser Ser Glu Gly Ala
215                 220                 225                 230

TGG AAA CAT GTC CAG AGA ATT GAA ACT TGG ATC TTG AGA CAT CCA GGC  834
Trp Lys His Val Gln Arg Ile Glu Thr Trp Ile Leu Arg His Pro Gly
                235                 240                 245

TTC ACC ATG ATG GCA GCA ATC CTG GCA TAC ACC ATA GGA ACG ACA CAT  882
Phe Thr Met Met Ala Ala Ile Leu Ala Tyr Thr Ile Gly Thr Thr His
            250                 255                 260

TTC CAA AGA GCC CTG ATT TTC ATC TTA CTG ACA GCT GTC ACT CCT TCA  930
Phe Gln Arg Ala Leu Ile Phe Ile Leu Leu Thr Ala Val Thr Pro Ser
        265                 270                 275

ATG ACA ATG CGT TGC ATA GGA ATG TCA AAT AGA GAC TTT GTG GAA GGG  978
Met Thr Met Arg Cys Ile Gly Met Ser Asn Arg Asp Phe Val Glu Gly
    280                 285                 290

GTT TCA GGA GGA AGC TGG GTT GAC ATA GTC TTA GAA CAT GGA AGC TGT  1026
Val Ser Gly Gly Ser Trp Val Asp Ile Val Leu Glu His Gly Ser Cys
295                 300                 305                 310

GTG ACG ACG ATG GCA AAA AAC AAA CCA ACA TTG GAT TTT GAA CTG ATA  1074
Val Thr Thr Met Ala Lys Asn Lys Pro Thr Leu Asp Phe Glu Leu Ile
                315                 320                 325

AAA ACA GAA GCC AAA CAG CCT GCC ACC CTA AGG AAG TAC TGT ATA GAG  1122
Lys Thr Glu Ala Lys Gln Pro Ala Thr Leu Arg Lys Tyr Cys Ile Glu
            330                 335                 340

GCA AAG CTA ACC AAC ACA ACA ACA GAA TCT CGC TGC CCA ACA CAA GGG  1170
Ala Lys Leu Thr Asn Thr Thr Thr Glu Ser Arg Cys Pro Thr Gln Gly
        345                 350                 355

GAA CCC AGC CTA AAT GAA GAG CAG GAC AAA AGG TTC GTC TGC AAA CAC  1218
Glu Pro Ser Leu Asn Glu Glu Gln Asp Lys Arg Phe Val Cys Lys His
    360                 365                 370

TCC ATG GTA GAC AGA GGA TGG GGA AAT GGA TGT GGA CTA TTT GGA AAG  1266
Ser Met Val Asp Arg Gly Trp Gly Asn Gly Cys Gly Leu Phe Gly Lys
375                 380                 385                 390

GGA GGC ATT GTG ACC TGT GCT ATG TTC AGA TGC AAA AAG AAC ATG GAA  1314
Gly Gly Ile Val Thr Cys Ala Met Phe Arg Cys Lys Lys Asn Met Glu
                395                 400                 405

GGA AAA GTT GTG CAA CCA GAA AAC TTG GAA TAC ACC ATT GTG ATA ACA  1362
Gly Lys Val Val Gln Pro Glu Asn Leu Glu Tyr Thr Ile Val Ile Thr
            410                 415                 420

CCT CAC TCA GGG GAA GAG CAT GCA GTC GGA AAT GAC ACA GGA AAA CAT  1410
Pro His Ser Gly Glu Glu His Ala Val Gly Asn Asp Thr Gly Lys His
        425                 430                 435

GGC AAG GAA ATC AAA ATA ACA CCA CAG AGT TCC ATC ACA GAA GCA GAA  1458
Gly Lys Glu Ile Lys Ile Thr Pro Gln Ser Ser Ile Thr Glu Ala Glu
    440                 445                 450

TTG ACA GGT TAT GGC ACT GTC ACA ATG GAG TGC TCT CCA AGA ACG GGC  1506
Leu Thr Gly Tyr Gly Thr Val Thr Met Glu Cys Ser Pro Arg Thr Gly
455                 460                 465                 470

CTC GAC TTC AAT GAG ATG GTG TTG CTG CAG ATG GAA AAT AAA GCT TGG  1554
Leu Asp Phe Asn Glu Met Val Leu Leu Gln Met Glu Asn Lys Ala Trp
                475                 480                 485

CTG GTG CAC AGG CAA TGG TTC CTA GAC CTG CCG TTA CCA TGG TTG CCC  1602
Leu Val His Arg Gln Trp Phe Leu Asp Leu Pro Leu Pro Trp Leu Pro
            490                 495                 500

GGA GCG GAC ACA CAA GGG TCA AAT TGG ATA CAG AAA GAG ACA TTG GTC  1650
Gly Ala Asp Thr Gln Gly Ser Asn Trp Ile Gln Lys Glu Thr Leu Val
        505                 510                 515

ACT TTC AAA AAT CCC CAT GCG AAG AAA CAG GAT GTT GTT GTT TTA GGA  1698
Thr Phe Lys Asn Pro His Ala Lys Lys Gln Asp Val Val Val Leu Gly
    520                 525                 530

TCC CAA GAA GGG GCC ATG CAC ACA GCA CTT ACA GGG GCC ACA GAA ATC  1746
Ser Gln Glu Gly Ala Met His Thr Ala Leu Thr Gly Ala Thr Glu Ile
535                 540                 545                 550

CAA ATG TCA TCA GGA AAC TTA CTC TTC ACA GGA CAT CTC AAG TGC AGG  1794
Gln Met Ser Ser Gly Asn Leu Leu Phe Thr Gly His Leu Lys Cys Arg
                555                 560                 565

CTG AGA ATG GAC AAG CTA CAG CTC AAA GGA ATG TCA TAC TCT ATG TGC  1842
Leu Arg Met Asp Lys Leu Gln Leu Lys Gly Met Ser Tyr Ser Met Cys
            570                 575                 580

ACA GGA AAG TTT AAA GTT GTG AAG GAA ATA GCA GAA ACA CAA CAT GGA  1890
Thr Gly Lys Phe Lys Val Val Lys Glu Ile Ala Glu Thr Gln His Gly
        585                 590                 595

ACA ATA GTT ATC AGA GTG CAA TAT GAA GGG GAC GGC TCT CCA TGC AAG  1938
Thr Ile Val Ile Arg Val Gln Tyr Glu Gly Asp Gly Ser Pro Cys Lys
    600                 605                 610

ATC CCT TTT GAG ATA ATG GAT TTG GAA AAA AGA CAT GTC TTA GGT CGC  1986
Ile Pro Phe Glu Ile Met Asp Leu Glu Lys Arg His Val Leu Gly Arg
615                 620                 625                 630

CTG ATT ACA GTC AAC CCA ATT GTG ACA GAA AAA GAT AGC CCA GTC AAC  2034
Leu Ile Thr Val Asn Pro Ile Val Thr Glu Lys Asp Ser Pro Val Asn
                635                 640                 645

ATA GAA GCA GAA CCT CCA TTC GGA GAC AGC TAC ATC ATC ATA GGA GTA  2082
Ile Glu Ala Glu Pro Pro Phe Gly Asp Ser Tyr Ile Ile Ile Gly Val
            650                 655                 660

GAG CCG GGA CAA CTG AAG CTC AAC TGG TTT AAG AAA GGA AGT TCT ATC  2130
Glu Pro Gly Gln Leu Lys Leu Asn Trp Phe Lys Lys Gly Ser Ser Ile
        665                 670                 675

GGC CAA ATG TTT GAG ACA ACA ATG AGG GGG GCG AAG AGA ATG GCC ATT  2178
Gly Gln Met Phe Glu Thr Thr Met Arg Gly Ala Lys Arg Met Ala Ile
    680                 685                 690

TTA GGT GAC ACA GCC TGG GAT TTT GGA TCC TTG GGA GGA GTG TTT ACA  2226
Leu Gly Asp Thr Ala Trp Asp Phe Gly Ser Leu Gly Gly Val Phe Thr
695                 700                 705                 710

TCT ATA GGA AAG GCT CTC CAC CAA GTC TTT GGA GCA ATC TAT GGA GCT  2274
Ser Ile Gly Lys Ala Leu His Gln Val Phe Gly Ala Ile Tyr Gly Ala
                715                 720                 725

GCC TTC AGT GGG GTT TCA TGG ACT ATG AAA ATC CTC ATA GGA GTC ATT  2322
Ala Phe Ser Gly Val Ser Trp Thr Met Lys Ile Leu Ile Gly Val Ile
            730                 735                 740

ATC ACA TGG ATA GGA ATG AAT TCA CGC AGC ACC TCA CTG TCT GTG ACA  2370
Ile Thr Trp Ile Gly Met Asn Ser Arg Ser Thr Ser Leu Ser Val Thr
        745                 750                 755

CTA GTA TTG GTG GGA ATT GTG ACA CTG TAT TTG GGA GTC ATG GTG CAG  2418
Leu Val Leu Val Gly Ile Val Thr Leu Tyr Leu Gly Val Met Val Gln
    760                 765                 770

GCC GAT AGT GGT TGC GTT GTG AGC TGG AAA AAC AAA GAA CTG AAA TGT  2466
Ala Asp Ser Gly Cys Val Val Ser Trp Lys Asn Lys Glu Leu Lys Cys
775                 780                 785                 790

GGC AGT GGG ATT TTC ATC ACA GAC AAC GTG CAC ACA TGG ACA GAA CAA  2514
Gly Ser Gly Ile Phe Ile Thr Asp Asn Val His Thr Trp Thr Glu Gln
                795                 800                 805

TAC AAG TTC CAA CCA GAA TCC CCT TCA AAA CTA GCT TCA GCT ATC CAG  2562
Tyr Lys Phe Gln Pro Glu Ser Pro Ser Lys Leu Ala Ser Ala Ile Gln
            810                 815                 820

AAA GCC CAT GAA GAG GGC ATT TGT GGA ATC CGC TCA GTA ACA AGA CTG  2610
Lys Ala His Glu Glu Gly Ile Cys Gly Ile Arg Ser Val Thr Arg Leu
        825                 830                 835

GAG AAT CTG ATG TGG AAA CAA ATA ACA CCA GAA TTG AAT CAC ATT CTA  2658
Glu Asn Leu Met Trp Lys Gln Ile Thr Pro Glu Leu Asn His Ile Leu
    840                 845                 850

TCA GAA AAT GAG GTG AAG TTA ACT ATT ATG ACA GGA GAC ATC AAA [...] 2706
Ser Glu Asn Glu Val Lys Leu Thr Ile Met Thr Gly Asp Ile Lys Gly
855 860 865 870

[...] TCT [...] CCT CAG CCC ACT GAG CTG 2754
Ile Met Gln Ala Gly Lys Arg Ser Leu Arg Pro Gln Pro Thr Glu Leu
875 880 885

AAG TAT TCA TGG AAA ACA TGG GGC AAA GCA AAA ATG CTC TCT ACA [...] 2802
Lys Tyr Ser Trp Lys Thr Trp Gly Lys Ala Lys Met Leu Ser Thr Glu
890 895 900

[...] AAC CAG ACC TTT CTC ATT GAT GGC CCC GAA ACA GCA GAA TGC 2850
Ser His Asn Gln Thr Phe Leu Ile Asp Gly Pro Glu Thr Ala Glu Cys
905 910 915

CCC AAC ACA AAT AGA GCT TGG AAT TCG TTG GAA GTT GAA GAC TAT [...] 2898
Pro Asn Thr Asn Arg Ala Trp Asn Ser Leu Glu Val Glu Asp Tyr Gly
920 925 930

TTT GGA GTA TTC ACC ACC AAT ATA TGG CTA AAA TTG AAA GAA AAA [...] 2946
Phe Gly Val Phe Thr Thr Asn Ile Trp Leu Lys Leu Lys Glu Lys Gln
935 940 945 950

[...] GAC TCA [...] CTC ATG TCA GCG GCC ATA AAA GAC AAC 2994
Asp Val Phe Cys Asp Ser Lys Leu Met Ser Ala Ala Ile Lys Asp Asn
955 960 965

AGA GCC GTC CAT GCC GAT ATG GGT TAT TGG ATA GAA AGT GCA CTC AAT 3042
Arg Ala Val His Ala Asp Met Gly Tyr Trp Ile Glu Ser Ala Leu Asn
970 975 980

GAC ACA TGG AAG ATA GAG AAA GCC TCT TTC ATT GAA GTT AAA AAC TGC 3090
Asp Thr Trp Lys Ile Glu Lys Ala Ser Phe Ile Glu Val Lys Asn Cys
985 990 995

CAC TGG GCA AAA TCA CAC ACC . CTC TGG AGC AAT GGA GTG CTA GAA AGT m« 

IOoS ° ^ HiS ,VZi LSU Trp Ser Asn Lei Glu sS 38 

1UUU 1005 1010 

GAG ATG ATA ATT CCA AAG AAT CTC GCT GGA CCA GTG TCT CAA CAC AAC 3186
Glu Met Ile Ile Pro Lys Asn Leu Ala Gly Pro Val Ser Gln His Asn
1015 1020 1025 1030

TAT AGA CCA GGC TAC CAT ACA CAA ATA ACA GGA CCA TGG CAT CTA GGT 3234
Tyr Arg Pro Gly Tyr His Thr Gln Ile Thr Gly Pro Trp His Leu Gly
1035 1040 1045

AAG CTT GAG ATG GAC TTT GAT TTC TGT GAT GGA ACA ACA GTG GTA GTG 3282
Lys Leu Glu Met Asp Phe Asp Phe Cys Asp Gly Thr Thr Val Val Val
1050 1055 1060

[...] GAC TGC GGA AAT AGA GGA CCC TCT TTG AGA ACA ACC ACT GCC 3330
Thr Glu Asp Cys Gly Asn Arg Gly Pro Ser Leu Arg Thr Thr Thr Ala
1065 1070 1075

TCT GGA AAA CTC ATA ACA GAA TGG TGC TGC CGA TCT TGC ACA TTA CCA 3378
Ser Gly Lys Leu Ile Thr Glu Trp Cys Cys Arg Ser Cys Thr Leu Pro
1080 1085 1090

CCG CTA AGA TAC AGA GGT GAG GAT GGG TGC TGG TAC GGG ATG GAA ATC 3426
Pro Leu Arg Tyr Arg Gly Glu Asp Gly Cys Trp Tyr Gly Met Glu Ile
1095 1100 1105 1110

AGA CCA TTG AAG GAG AAA GAA GAG AAT TTG GTC AAC TCC TTG GTC ACA 3474
Arg Pro Leu Lys Glu Lys Glu Glu Asn Leu Val Asn Ser Leu Val Thr
1115 1120 1125

GCT GGA CAT GGG CAG GTC GAC AAC TTT TCA CTA GGA GTC TTG GGA ATG 3522
Ala Gly His Gly Gln Val Asp Asn Phe Ser Leu Gly Val Leu Gly Met
1130 1135 1140

GCA TTG TTC CTG GAG GAA ATG CTT AGG ACC CGA GTA GGA ACG AAA CAT 3570
Ala Leu Phe Leu Glu Glu Met Leu Arg Thr Arg Val Gly Thr Lys His
1145 1150 1155

GCA ATA CTA CTA GTT GCA GTT TCT TTT GTG ACA TTG ATC ACA GGG AAC 3618
Ala Ile Leu Leu Val Ala Val Ser Phe Val Thr Leu Ile Thr Gly Asn
1160 1165 1170

ATG TCC TTT AGA GAC CTG GGA AGA GTG ATG GTT ATG GTA GGC GCC ACT 3666
Met Ser Phe Arg Asp Leu Gly Arg Val Met Val Met Val Gly Ala Thr
1175 1180 1185 1190

ATG ACG GAT GAC ATA GGT ATG GGC GTG ACT TAT CTT GCC CTA CTA GCA 3714
Met Thr Asp Asp Ile Gly Met Gly Val Thr Tyr Leu Ala Leu Leu Ala
1195 1200 1205

GCC TTC AAA GTC AGA CCA ACT TTT GCA GCT GGA CTA CTC TTG AGA AAG 3762
Ala Phe Lys Val Arg Pro Thr Phe Ala Ala Gly Leu Leu Leu Arg Lys
1210 1215 1220

CTG ACC TCC AAG GAA TTG ATG ATG ACT ACT ATA GGA ATT GTA CTC CTC 3810
Leu Thr Ser Lys Glu Leu Met Met Thr Thr Ile Gly Ile Val Leu Leu
1225 1230 1235

TCC CAG AGC ACC ATA CCA GAG ACC ATT CTT GAG TTG ACT GAT GCG TTA 3858
Ser Gln Ser Thr Ile Pro Glu Thr Ile Leu Glu Leu Thr Asp Ala Leu
1240 1245 1250

GCC TTA GGC ATG ATG GTC CTC AAA ATG GTG AGA AAT ATG GAA AAG TAT 3906
Ala Leu Gly Met Met Val Leu Lys Met Val Arg Asn Met Glu Lys Tyr
1255 1260 1265 1270

CAA TTG GCA GTG ACT ATC ATG GCT ATC TTG TGC GTC CCA AAC GCA GTG 3954
Gln Leu Ala Val Thr Ile Met Ala Ile Leu Cys Val Pro Asn Ala Val
1275 1280 1285

ATA TTA CAA AAC GCA TGG AAA GTG AGT TGC ACA ATA TTG GCA GTG GTG 4002
Ile Leu Gln Asn Ala Trp Lys Val Ser Cys Thr Ile Leu Ala Val Val
1290 1295 1300

TCC GTT TCC CCA CTG CTC TTA ACA TCC TCA CAG CAA AAA ACA GAT TGG 4050
Ser Val Ser Pro Leu Leu Leu Thr Ser Ser Gln Gln Lys Thr Asp Trp
1305 1310 1315

ATA CCA TTA GCA TTG ACG ATC AAA GGT CTC AAT CCA ACA GCT ATT TTT 4098
Ile Pro Leu Ala Leu Thr Ile Lys Gly Leu Asn Pro Thr Ala Ile Phe
1320 1325 1330

CTA ACA ACC CTC TCA AGA ACC AGC AAG AAA AGG AGC TGG CCA TTA AAT 4146
Leu Thr Thr Leu Ser Arg Thr Ser Lys Lys Arg Ser Trp Pro Leu Asn
1335 1340 1345 1350

GAG GCT ATC ATG GCA GTC GGG ATG GTG AGC ATT TTA GCC AGT TCT CTC 4194
Glu Ala Ile Met Ala Val Gly Met Val Ser Ile Leu Ala Ser Ser Leu
1355 1360 1365

CTA AAA AAT GAT ATT CCC ATG ACA GGA CCA TTA GTG GCT GGA GGG CTC 4242
Leu Lys Asn Asp Ile Pro Met Thr Gly Pro Leu Val Ala Gly Gly Leu
1370 1375 1380

CTC ACT GTG TGC TAC GTG CTC ACT GGA CGA TCG GCC GAT TTG GAA CTG 4290
Leu Thr Val Cys Tyr Val Leu Thr Gly Arg Ser Ala Asp Leu Glu Leu
1385 1390 1395

GAG AGA GCA GCC GAT GTC AAA TGG GAA GAC CAG GCA GAG ATA TCA GGA 4338
Glu Arg Ala Ala Asp Val Lys Trp Glu Asp Gln Ala Glu Ile Ser Gly
1400 1405 1410

AGC AGT CCA ATC CTG TCA ATA ACA ATA TCA GAA GAT GGT AGC ATG TCG 4386
Ser Ser Pro Ile Leu Ser Ile Thr Ile Ser Glu Asp Gly Ser Met Ser
1415 1420 1425 1430

ATA AAA AAT GAA GAG GAA GAA CAA ACA CTG ACC ATA CTC ATT AGA ACA 4434
Ile Lys Asn Glu Glu Glu Glu Gln Thr Leu Thr Ile Leu Ile Arg Thr
1435 1440 1445

GGA TTG CTG GTG ATC TCA GGA CTT TTT CCT GTA TCA ATA CCA ATC ACG 4482
Gly Leu Leu Val Ile Ser Gly Leu Phe Pro Val Ser Ile Pro Ile Thr
1450 1455 1460

GCA GCA GCA TGG TAC CTG TGG GAA GTG AAG AAA CAA CGG GCC GGA GTA 4530
Ala Ala Ala Trp Tyr Leu Trp Glu Val Lys Lys Gln Arg Ala Gly Val
1465 1470 1475

TTG TGG GAT GTT CCT TCA CCC CCA CCC ATG GGA AAG GCT GAA CTG GAA 4578
Leu Trp Asp Val Pro Ser Pro Pro Pro Met Gly Lys Ala Glu Leu Glu
1480 1485 1490

GAT GGA GCC TAT AGA ATT AAG CAA AAA GGG ATT CTT GGA TAT TCC CAG 4626
Asp Gly Ala Tyr Arg Ile Lys Gln Lys Gly Ile Leu Gly Tyr Ser Gln
1495 1500 1505 1510

ATC GGA GCC GGA GTT TAC AAA GAA GGA ACA TTC CAT ACA ATG TGG CAT 4674
Ile Gly Ala Gly Val Tyr Lys Glu Gly Thr Phe His Thr Met Trp His
1515 1520 1525

GTC ACA CGT GGC GCT GTT CTA ATG CAT AAA GGA AAG AGG ATT GAA CCA 4722
Val Thr Arg Gly Ala Val Leu Met His Lys Gly Lys Arg Ile Glu Pro
1530 1535 1540

TCA TGG GCG GAC GTC AAG AAA GAC CTA ATA TCA TAT GGA GGA GGC TGG 4770
Ser Trp Ala Asp Val Lys Lys Asp Leu Ile Ser Tyr Gly Gly Gly Trp
1545 1550 1555

AAG TTA GAA GGA GAA TGG AAG GAA GGA GAA GAA GTC CAG GTA TTG GCA 4818
Lys Leu Glu Gly Glu Trp Lys Glu Gly Glu Glu Val Gln Val Leu Ala
1560 1565 1570

CTG GAG CCT GGA AAA AAT CCA AGA GCC GTC CAA ACG AAA CCT GGT CTT 4866
Leu Glu Pro Gly Lys Asn Pro Arg Ala Val Gln Thr Lys Pro Gly Leu
1575 1580 1585 1590

TTC AAA ACC AAC GCC GGA ACA ATA GGT GCT GTA TCT CTG GAC TTT TCT 4914
Phe Lys Thr Asn Ala Gly Thr Ile Gly Ala Val Ser Leu Asp Phe Ser
1595 1600 1605

CCT GGA ACG TCA GGA TCT CCA ATT ATC GAC AAA AAA GGA AAA GTT GTG 4962
Pro Gly Thr Ser Gly Ser Pro Ile Ile Asp Lys Lys Gly Lys Val Val
1610 1615 1620

GGT CTT TAT GGT AAT GGT GTT GTT ACA AGG AGT GGA GCA TAT GTG AGT 5010
Gly Leu Tyr Gly Asn Gly Val Val Thr Arg Ser Gly Ala Tyr Val Ser
1625 1630 1635

GCT ATA GCC CAG ACT GAA AAA AGC ATT GAA GAC AAC CCA GAG ATC GAA 5058
Ala Ile Ala Gln Thr Glu Lys Ser Ile Glu Asp Asn Pro Glu Ile Glu
1640 1645 1650

GAT GAC ATT TTC CGA AAG AGA AGA CTG ACC ATC ATG GAC CTC CAC CCA 5106
Asp Asp Ile Phe Arg Lys Arg Arg Leu Thr Ile Met Asp Leu His Pro
1655 1660 1665 1670

GGA GCG GGA AAG ACG AAG AGA TAC CTT CCG GCC ATA GTC AGA GAA GCT 5154
Gly Ala Gly Lys Thr Lys Arg Tyr Leu Pro Ala Ile Val Arg Glu Ala
1675 1680 1685

ATA AAA CGG GGT TTG AGA ACA TTA ATC TTG GCC CCC ACT AGA GTT GTG 5202
Ile Lys Arg Gly Leu Arg Thr Leu Ile Leu Ala Pro Thr Arg Val Val
1690 1695 1700

GCA GCT GAA ATG GAG GAA GCC CTT AGA GGA CTT CCA ATA AGA TAC CAG 5250
Ala Ala Glu Met Glu Glu Ala Leu Arg Gly Leu Pro Ile Arg Tyr Gln
1705 1710 1715

ACC CCA GCC ATC AGA GCT GAG CAC ACC GGG CGG GAG ATT GTG GAC CTA 5298
Thr Pro Ala Ile Arg Ala Glu His Thr Gly Arg Glu Ile Val Asp Leu
1720 1725 1730

ATG TGT CAT GCC ACA TTT ACC ATG AGG CTG CTA TCA CCA GTT AGA GTG 5346
Met Cys His Ala Thr Phe Thr Met Arg Leu Leu Ser Pro Val Arg Val
1735 1740 1745 1750

CCA AAC TAC AAC CTG ATT ATC ATG GAC GAA GCC CAT TTC ACA GAC CCA 5394
Pro Asn Tyr Asn Leu Ile Ile Met Asp Glu Ala His Phe Thr Asp Pro
1755 1760 1765

GCA AGT ATA GCA GCT AGA GGA TAC ATC TCA ACT CGA GTG GAG ATG GGT 5442
Ala Ser Ile Ala Ala Arg Gly Tyr Ile Ser Thr Arg Val Glu Met Gly
1770 1775 1780

GAG GCA GCT GGG ATT TTT ATG ACA GCC ACT CCC CCG GGA AGC AGA GAC 5490
Glu Ala Ala Gly Ile Phe Met Thr Ala Thr Pro Pro Gly Ser Arg Asp
1785 1790 1795

CCA TTT CCT CAG AGC AAT GCA CCA ATC ATA GAT GAA GAA AGA GAA ATC 5538
Pro Phe Pro Gln Ser Asn Ala Pro Ile Ile Asp Glu Glu Arg Glu Ile
1800 1805 1810

CCT GAA CGC TCG TGG AAT TCC GGA CAT GAA TGG GTC ACG GAT TTT AAA 5586
Pro Glu Arg Ser Trp Asn Ser Gly His Glu Trp Val Thr Asp Phe Lys
1815 1820 1825 1830

GGG AAG ACT GTT TGG TTC GTT CCA AGT ATA AAA GCA GGA AAT GAT ATA 5634
Gly Lys Thr Val Trp Phe Val Pro Ser Ile Lys Ala Gly Asn Asp Ile
1835 1840 1845

GCA GCT TGC CTG AGG AAA AAT GGA AAG AAA GTG ATA CAA CTC AGT AGG 5682
Ala Ala Cys Leu Arg Lys Asn Gly Lys Lys Val Ile Gln Leu Ser Arg
1850 1855 1860

AAG ACC TTT GAT TCT GAG TAT GTC AAG ACT AGA ACC AAT GAT TGG GAC 5730
Lys Thr Phe Asp Ser Glu Tyr Val Lys Thr Arg Thr Asn Asp Trp Asp
1865 1870 1875

TTC GTG GTT ACA ACT GAC ATT TCA GAA ATG GGT GCC AAT TTC AAG GCT 5778
Phe Val Val Thr Thr Asp Ile Ser Glu Met Gly Ala Asn Phe Lys Ala
1880 1885 1890

GAG AGG GTT ATA GAC CCC AGA CGC TGC ATG AAA CCA GTC ATA CTA ACA 5826
Glu Arg Val Ile Asp Pro Arg Arg Cys Met Lys Pro Val Ile Leu Thr
1895 1900 1905 1910

GAT GGT GAA GAG CGG GTG ATT CTG GCA GGA CCT ATG CCA GTG ACC CAC 5874
Asp Gly Glu Glu Arg Val Ile Leu Ala Gly Pro Met Pro Val Thr His
1915 1920 1925

TCT AGT GCA GCA CAA AGA AGA GGG AGA ATA GGA AGA AAT CCA AAA AAT 5922
Ser Ser Ala Ala Gln Arg Arg Gly Arg Ile Gly Arg Asn Pro Lys Asn
1930 1935 1940

GAG AAT GAC CAG TAC ATA TAC ATG GGG GAA CCT CTG GAA AAT GAT GAA 5970
Glu Asn Asp Gln Tyr Ile Tyr Met Gly Glu Pro Leu Glu Asn Asp Glu
1945 1950 1955

GAC TGT GCA CAC TGG AAA GAA GCT AAA ATG CTC CTA GAT AAC ATC AAC 6018
Asp Cys Ala His Trp Lys Glu Ala Lys Met Leu Leu Asp Asn Ile Asn
1960 1965 1970

ACG CCA GAA GGA ATC ATT CCT AGC ATG TTC GAA CCA GAG CGT GAA AAG 6066
Thr Pro Glu Gly Ile Ile Pro Ser Met Phe Glu Pro Glu Arg Glu Lys
1975 1980 1985 1990

GTG GAT GCC ATT GAT GGC GAA TAC CGC TTG AGA GGA GAA GCA AGG AAA 6114
Val Asp Ala Ile Asp Gly Glu Tyr Arg Leu Arg Gly Glu Ala Arg Lys
1995 2000 2005

ACC TTT GTA GAC TTA ATG AGA AGA GGA GAC CTA CCA GTC TGG TTG GCC 6162
Thr Phe Val Asp Leu Met Arg Arg Gly Asp Leu Pro Val Trp Leu Ala
2010 2015 2020

TAC AGA GTG GCA GCT GAA GGC ATC AAC TAC GCA GAC AGA AGG TGG TGT 6210
Tyr Arg Val Ala Ala Glu Gly Ile Asn Tyr Ala Asp Arg Arg Trp Cys
2025 2030 2035

TTT GAT GGA GTC AAG AAC AAC CAA ATC CTA GAA GAA AAC GTG GAA GTT 6258
Phe Asp Gly Val Lys Asn Asn Gln Ile Leu Glu Glu Asn Val Glu Val
2040 2045 2050

GAA ATC TGG ACA AAA GAA GGG GAA AGG AAG AAA TTG AAA CCC AGA TGG 6306
Glu Ile Trp Thr Lys Glu Gly Glu Arg Lys Lys Leu Lys Pro Arg Trp
2055 2060 2065 2070

TTG GAT GCT AGG ATC TAT TCT GAC CCA CTG GCG CTA AAA GAA TTT AAG 6354
Leu Asp Ala Arg Ile Tyr Ser Asp Pro Leu Ala Leu Lys Glu Phe Lys
2075 2080 2085

GAA TTT GCA GCC GGA AGA AAG TCT CTG ACC CTG AAC CTA ATC ACA GAA 6402
Glu Phe Ala Ala Gly Arg Lys Ser Leu Thr Leu Asn Leu Ile Thr Glu
2090 2095 2100

ATG GGT AGG CTC CCA ACC TTC ATG ACT CAG AAG GCA AGA GAC GCA CTG 6450
Met Gly Arg Leu Pro Thr Phe Met Thr Gln Lys Ala Arg Asp Ala Leu
2105 2110 2115

GAC AAC TTA GCA GTG CTG CAC ACG GCT GAG GCA GGT GGA AGG GCG TAC 6498
Asp Asn Leu Ala Val Leu His Thr Ala Glu Ala Gly Gly Arg Ala Tyr
2120 2125 2130

AAC CAT GCT CTC AGT GAA CTG CCG GAG ACC CTG GAG ACA TTG CTT TTA 6546
Asn His Ala Leu Ser Glu Leu Pro Glu Thr Leu Glu Thr Leu Leu Leu
2135 2140 2145 2150

CTG ACA CTT CTG GCT ACA GTC ACG GGA GGG ATC TTT TTA TTC TTG ATG 6594
Leu Thr Leu Leu Ala Thr Val Thr Gly Gly Ile Phe Leu Phe Leu Met
2155 2160 2165

AGC GGA AGG GGC ATA GGG AAG ATG ACC CTG GGA ATG TGC TGC ATA ATC 6642
Ser Gly Arg Gly Ile Gly Lys Met Thr Leu Gly Met Cys Cys Ile Ile
2170 2175 2180

ACG GCT AGC ATC CTC CTA TGG TAC GCA CAA ATA CAG CCA CAC TGG ATA 6690
Thr Ala Ser Ile Leu Leu Trp Tyr Ala Gln Ile Gln Pro His Trp Ile
2185 2190 2195



[...] TCA ATA ATA CTG GAG TTT TTT CTC ATA GTT TTG CTT [...] 6738
[...] Ser Ile Ile Leu Glu Phe Phe Leu Ile Val Leu Leu [...]
2200 2205 2210

[...] GAA AAA CAG AGA ACA CCC CAA GAC AAC CAA CTG ACC [...] 6786
[...] Glu Lys Gln Arg Thr Pro Gln Asp Asn Gln Leu Thr [...]
2215 2220 2225 2230

[...] GCC ATC CTC ACA GTG GTG GCC GCA ACC ATG GCA AAC [...] 6834
[...] Ala Ile Leu Thr Val Val Ala Ala Thr Met Ala Asn [...]
2235 2240 2245

GGT TTC CTA GAA AAA ACG AAG AAA GAT CTC GGA TTG GGA AGC ATT GCA 6882
Gly Phe Leu Glu Lys Thr Lys Lys Asp Leu Gly Leu Gly Ser Ile Ala
2250 2255 2260

[...] CAG CAA CCC GAG AGC AAC ATC CTG GAC ATA GAT CTA CGT CCT GCA 6930
Thr Gln Gln Pro Glu Ser Asn Ile Leu Asp Ile Asp Leu Arg Pro Ala
2265 2270 2275

TCA GCA TGG ACG CTG TAT GCC GTG GCC ACA ACA TTT GTT ACA CCA ATG 6978
Ser Ala Trp Thr Leu Tyr Ala Val Ala Thr Thr Phe Val Thr Pro Met
2280 2285 2290

TTG AGA CAT AGC ATT GAA AAT TCC TCA GTG AAT GTG TCC CTA ACA GCT 7026
Leu Arg His Ser Ile Glu Asn Ser Ser Val Asn Val Ser Leu Thr Ala
2295 2300 2305 2310

ATA GCC AAC CAA GCC ACA GTG TTA ATG GGT CTC GGG AAA GGA TGG CCA 7074
Ile Ala Asn Gln Ala Thr Val Leu Met Gly Leu Gly Lys Gly Trp Pro
2315 2320 2325

TTG TCA AAG ATG GAC ATC GGA GTT CCC CTT CTC GCC ATT GGA TGC TAC 7122
Leu Ser Lys Met Asp Ile Gly Val Pro Leu Leu Ala Ile Gly Cys Tyr
2330 2335 2340

TCA CAA GTC AAC CCC ATA ACT CTC ACA GCA GCT CTT TTC TTA TTG GTA 7170
Ser Gln Val Asn Pro Ile Thr Leu Thr Ala Ala Leu Phe Leu Leu Val
2345 2350 2355

GCA CAT TAT GCC ATC ATA GGG CCA GGA CTC CAA GCA AAA GCA ACC AGA 7218
Ala His Tyr Ala Ile Ile Gly Pro Gly Leu Gln Ala Lys Ala Thr Arg
2360 2365 2370

GAA GCT CAG AAA AGA GCA GCG GCG GGC ATC ATG AAA AAC CCA ACT GTC 7266
Glu Ala Gln Lys Arg Ala Ala Ala Gly Ile Met Lys Asn Pro Thr Val
2375 2380 2385 2390

GAT GGA ATA ACA GTG ATT GAC CTA GAT CCA ATA CCT TAT GAT CCA AAG 7314
Asp Gly Ile Thr Val Ile Asp Leu Asp Pro Ile Pro Tyr Asp Pro Lys
2395 2400 2405

TTT GAA AAG CAG TTG GGA CAA GTA ATG CTC CTA GTC CTC TGC GTG ACT 7362
Phe Glu Lys Gln Leu Gly Gln Val Met Leu Leu Val Leu Cys Val Thr
2410 2415 2420

CAA GTA TTG ATG ATG AGG ACT ACA TGG GCT CTG TGT GAG GCT TTA ACC 7410
Gln Val Leu Met Met Arg Thr Thr Trp Ala Leu Cys Glu Ala Leu Thr
2425 2430 2435

TTA GCT ACC GGG CCC ATC TCC ACA TTG TGG GAA GGA AAT CCA GGG AGG 7458
Leu Ala Thr Gly Pro Ile Ser Thr Leu Trp Glu Gly Asn Pro Gly Arg
2440 2445 2450

TTT TGG AAC ACT ACC ATT GCG GTG TCA ATG GCT AAC ATT TTT AGA GGG 7506
Phe Trp Asn Thr Thr Ile Ala Val Ser Met Ala Asn Ile Phe Arg Gly
2455 2460 2465 2470

AGT TAC TTG GCC GGA GCT GGA CTT CTC TTT TCT ATT ATG AAG AAC ACA 7554
Ser Tyr Leu Ala Gly Ala Gly Leu Leu Phe Ser Ile Met Lys Asn Thr
2475 2480 2485

ACC AAC ACA AGA AGG GGA ACT GGC AAC ATA GGA GAG ACG CTT GGA GAG 7602
Thr Asn Thr Arg Arg Gly Thr Gly Asn Ile Gly Glu Thr Leu Gly Glu
2490 2495 2500

AAA TGG AAA AGC CGA TTG AAC GCA TTG GGA AAA AGT GAA TTC CAG ATC 7650
Lys Trp Lys Ser Arg Leu Asn Ala Leu Gly Lys Ser Glu Phe Gln Ile
2505 2510 2515

TAC AAG AAA AGT GGA ATC CAG GAA GTG GAT AGA ACC TTA GCA AAA GAA 7698
Tyr Lys Lys Ser Gly Ile Gln Glu Val Asp Arg Thr Leu Ala Lys Glu
2520 2525 2530

GGC ATT AAA AGA GGA GAA ACG GAC CAT CAC GCT GTG TCG CGA GGC TCA 7746
Gly Ile Lys Arg Gly Glu Thr Asp His His Ala Val Ser Arg Gly Ser
2535 2540 2545 2550

GCA AAA CTG AGA TGG TTC GTT GAG AGA AAC ATG GTC ACA CCA GAA GGG 7794
Ala Lys Leu Arg Trp Phe Val Glu Arg Asn Met Val Thr Pro Glu Gly
2555 2560 2565

AAA GTA GTG GAC CTC GGT TGT GGC AGA GGA GGC TGG TCA TAC TAT TGT 7842
Lys Val Val Asp Leu Gly Cys Gly Arg Gly Gly Trp Ser Tyr Tyr Cys
2570 2575 2580

GGA GGA CTA AAG AAT GTA AGA GAA GTC AAA GGC CTA ACA AAA GGA GGA 7890
Gly Gly Leu Lys Asn Val Arg Glu Val Lys Gly Leu Thr Lys Gly Gly
2585 2590 2595

CCA GGA CAC GAA GAA CCC ATC CCC ATG TCA ACA TAT GGG TGG AAT CTA 7938
Pro Gly His Glu Glu Pro Ile Pro Met Ser Thr Tyr Gly Trp Asn Leu
2600 2605 2610

GTG CGT CTT CAA AGT GGA GTT GAC GTT TTC TTC ATC CCG CCA GAA AAG 7986
Val Arg Leu Gln Ser Gly Val Asp Val Phe Phe Ile Pro Pro Glu Lys
2615 2620 2625 2630

TGT GAC ACA TTA TTG TGT GAC ATA GGG GAG TCA TCA CCA AAT CCC ACA 8034
Cys Asp Thr Leu Leu Cys Asp Ile Gly Glu Ser Ser Pro Asn Pro Thr
2635 2640 2645

GTG GAA GCA GGA CGA ACA CTC AGA GTC CTT AAC TTA GTA GAA AAT TGG 8082
Val Glu Ala Gly Arg Thr Leu Arg Val Leu Asn Leu Val Glu Asn Trp
2650 2655 2660

TTG AAC AAC AAC ACT CAA TTT TGC ATA AAG GTT CTC AAC CCA TAT ATG 8130
Leu Asn Asn Asn Thr Gln Phe Cys Ile Lys Val Leu Asn Pro Tyr Met
2665 2670 2675

CCC TCA GTC ATA GAA AAA ATG GAA GCA CTA CAA AGG AAA TAT GGA GGA 8178
Pro Ser Val Ile Glu Lys Met Glu Ala Leu Gln Arg Lys Tyr Gly Gly
2680 2685 2690

GCC TTA GTG AGG AAT CCA CTC TCA CGA AAC TCC ACA CAT GAG ATG TAC 8226
Ala Leu Val Arg Asn Pro Leu Ser Arg Asn Ser Thr His Glu Met Tyr
2695 2700 2705 2710

TGG GTA TCC AAT GCT TCC GGG AAC ATA GTG TCA TCA GTG AAC ATG ATT 8274
Trp Val Ser Asn Ala Ser Gly Asn Ile Val Ser Ser Val Asn Met Ile
2715 2720 2725

TCA AGG ATG TTG ATC AAC AGA TTT ACA ATG AGA TAC AAG AAA GCC ACT 8322
Ser Arg Met Leu Ile Asn Arg Phe Thr Met Arg Tyr Lys Lys Ala Thr
2730 2735 2740

TAC GAG CCG GAT GTT GAC CTC GGA AGC GGA ACC CGT AAC ATC GGG ATT 8370
Tyr Glu Pro Asp Val Asp Leu Gly Ser Gly Thr Arg Asn Ile Gly Ile
2745 2750 2755

GAA AGT GAG ATA CCA AAC CTA GAT ATA ATT GGG AAA AGA ATA GAA AAA 8418
Glu Ser Glu Ile Pro Asn Leu Asp Ile Ile Gly Lys Arg Ile Glu Lys
2760 2765 2770

ATA AAG CAA GAG CAT GAA ACA TCA TGG CAC TAT GAC CAA GAC CAC CCA 8466
Ile Lys Gln Glu His Glu Thr Ser Trp His Tyr Asp Gln Asp His Pro
2775 2780 2785 2790

TAC AAA ACG TGG GCA TAC CAT GGT AGC TAT GAA ACA AAA CAG ACT GGA 8514
Tyr Lys Thr Trp Ala Tyr His Gly Ser Tyr Glu Thr Lys Gln Thr Gly
2795 2800 2805

TCA GCA TCA TCC ATG GTC AAC GGA GTG GTC AGG CTG CTG ACA AAA CCT 8562
Ser Ala Ser Ser Met Val Asn Gly Val Val Arg Leu Leu Thr Lys Pro
2810 2815 2820

TGG GAC GTC GTC CCC ATG GTG ACA CAG ATG GCA ATG ACA GAC ACG ACT 8610
Trp Asp Val Val Pro Met Val Thr Gln Met Ala Met Thr Asp Thr Thr
2825 2830 2835

CCA TTT GGA CAA CAG CGC GTT TTT AAA GAG AAA GTG GAC ACG AGA ACC 8658
Pro Phe Gly Gln Gln Arg Val Phe Lys Glu Lys Val Asp Thr Arg Thr
2840 2845 2850

CAA GAA CCG AAA GAA GGC ACG AAG AAA CTA ATG AAA ATA ACA GCA GAG 8706
Gln Glu Pro Lys Glu Gly Thr Lys Lys Leu Met Lys Ile Thr Ala Glu
2855 2860 2865 2870

TGG CTT TGG AAA GAA TTA GGG AAG AAA AAG ACA CCC AGG ATG TGC ACC 8754
Trp Leu Trp Lys Glu Leu Gly Lys Lys Lys Thr Pro Arg Met Cys Thr
2875 2880 2885

AGA GAA GAA TTC ACA AGA AAG GTG AGA AGC AAT GCA GCC TTG GGG GCC 8802
Arg Glu Glu Phe Thr Arg Lys Val Arg Ser Asn Ala Ala Leu Gly Ala
2890 2895 2900

ATA TTC ACT GAT GAG AAC AAG TGG AAG TCG GCA CGT GAG GCT GTT GAA 8850
Ile Phe Thr Asp Glu Asn Lys Trp Lys Ser Ala Arg Glu Ala Val Glu
2905 2910 2915

GAT AGT AGG TTT TGG GAG CTG GTT GAC AAG GAA AGG AAT CTC CAT CTT 8898
Asp Ser Arg Phe Trp Glu Leu Val Asp Lys Glu Arg Asn Leu His Leu
2920 2925 2930

GAA GGA AAG TGT GAA ACA TGT GTG TAC AAC ATG ATG GGA AAA AGA GAG 8946
Glu Gly Lys Cys Glu Thr Cys Val Tyr Asn Met Met Gly Lys Arg Glu
2935 2940 2945 2950

AAG AAG CTA GGG GAA TTC GGC AAG GCA AAA GGC AGC AGA GCC ATA TGG 8994
Lys Lys Leu Gly Glu Phe Gly Lys Ala Lys Gly Ser Arg Ala Ile Trp
2955 2960 2965

TAC ATG TGG CTT GGA GCA CGC TTC TTA GAG TTT GAA GCC CTA GGA TTC 9042
Tyr Met Trp Leu Gly Ala Arg Phe Leu Glu Phe Glu Ala Leu Gly Phe
2970 2975 2980

TTA AAT GAA GAT CAC TGG TTC TCC AGA GAG AAC TCC CTG AGT GGA GTG 9090
Leu Asn Glu Asp His Trp Phe Ser Arg Glu Asn Ser Leu Ser Gly Val
2985 2990 2995

GAA GGA GAA GGG CTG CAC AAG CTA GGT TAC ATT CTA AGA GAC GTG AGC 9138
Glu Gly Glu Gly Leu His Lys Leu Gly Tyr Ile Leu Arg Asp Val Ser
3000 3005 3010

AAG AAA GAG GGA GGA GCA ATG TAT GCC GAT GAC ACC GCA GGA TGG GAT 9186
Lys Lys Glu Gly Gly Ala Met Tyr Ala Asp Asp Thr Ala Gly Trp Asp
3015 3020 3025 3030

ACA AGA ATC ACA CTA GAA GAC KKA AAA AAT GAA GAA ATG GTA ACA AAC 9234
Thr Arg Ile Thr Leu Glu Asp Xaa Lys Asn Glu Glu Met Val Thr Asn
3035 3040 3045

CAC ATG GAA GGA GAA CAC AAG AAA CTA GCC GAG GCC ATT TTC AAA CTA 9282
His Met Glu Gly Glu His Lys Lys Leu Ala Glu Ala Ile Phe Lys Leu
3050 3055 3060

ACG TAC CAA AAC AAG GTG GTG CGT GTG CAA AGA CCA ACA CCA AGA GGC 9330
Thr Tyr Gln Asn Lys Val Val Arg Val Gln Arg Pro Thr Pro Arg Gly
3065 3070 3075

ACA GTA ATG GAC ATC ATA TCG AGA AGA GAC CAA AGA GGT AGT GGA CAA 9378
Thr Val Met Asp Ile Ile Ser Arg Arg Asp Gln Arg Gly Ser Gly Gln
3080 3085 3090

GTT GGC ACC TAT GGA CTC AAT ACT TTC ACC AAT ATG GAA GCC CAA CTA 9426
Val Gly Thr Tyr Gly Leu Asn Thr Phe Thr Asn Met Glu Ala Gln Leu
3095 3100 3105 3110

ATC AGA CAG ATG GAG GGA GAA GGA GTC TTT AAA AGC ATT CAG CAC CTA 9474
Ile Arg Gln Met Glu Gly Glu Gly Val Phe Lys Ser Ile Gln His Leu
3115 3120 3125

ACA ATC ACA GAA GAA ATC GCT GTG CAA AAC TGG TTA GCA AGA GTG GGG 9522
Thr Ile Thr Glu Glu Ile Ala Val Gln Asn Trp Leu Ala Arg Val Gly
3130 3135 3140

CGC GAA AGG TTA TCA AGA ATG GCC ATC AGT GGA GAT GAT TGT GTT GTG 9570
Arg Glu Arg Leu Ser Arg Met Ala Ile Ser Gly Asp Asp Cys Val Val
3145 3150 3155

AAA CCT TTA GAT GAC AGG TTC GCA AGC GCT TTA ACA GCT CTA AAT GAC 9618
Lys Pro Leu Asp Asp Arg Phe Ala Ser Ala Leu Thr Ala Leu Asn Asp
3160 3165 3170

ATG GGA AAG ATT AGG AAA GAC ATA CAA CAA TGG GAA CCT TCA AGA GGA 9666
Met Gly Lys Ile Arg Lys Asp Ile Gln Gln Trp Glu Pro Ser Arg Gly
3175 3180 3185 3190

TGG AAT GAT TGG ACA CAA GTG CCC TTC TGT TCA CAC CAT TTC CAT GAG 9714
Trp Asn Asp Trp Thr Gln Val Pro Phe Cys Ser His His Phe His Glu
3195 3200 3205

TTA ATC ATG AAA GAC GGT CGC GTA CTC GTT GTT CCA TGT AGA AAC CAA 9762
Leu Ile Met Lys Asp Gly Arg Val Leu Val Val Pro Cys Arg Asn Gln
3210 3215 3220

GAT GAA CTG ATT GGC AGA GCC CGA ATC TCC CAA GGA GCA GGG TGG TCT 9810
Asp Glu Leu Ile Gly Arg Ala Arg Ile Ser Gln Gly Ala Gly Trp Ser
3225 3230 3235

TTG CGG GAG ACG GCC TGT TTG GGG AAG TCT TAC GCC CAA ATG TGG AGC 9858
Leu Arg Glu Thr Ala Cys Leu Gly Lys Ser Tyr Ala Gln Met Trp Ser
3240 3245 3250

TTG ATG TAC TTC CAC AGA CGC GAC CTC AGG CTG GCG GCA AAT GCT ATT 9906
Leu Met Tyr Phe His Arg Arg Asp Leu Arg Leu Ala Ala Asn Ala Ile
3255 3260 3265 3270

TGC TCG GCA GTA CCA TCA CAT TGG GTT CCA ACA AGT CGA ACA ACC TGG 9954
Cys Ser Ala Val Pro Ser His Trp Val Pro Thr Ser Arg Thr Thr Trp
3275 3280 3285

TCC ATA CAT GCT AAA CAT GAA TGG ATG ACA ACG GAA GAC ATG CTG ACA 10002
Ser Ile His Ala Lys His Glu Trp Met Thr Thr Glu Asp Met Leu Thr
3290 3295 3300

GTC TGG AAC AGG GTG TGG ATT CAA GAA AAC CCA TGG ATG GAA GAC AAA 10050
Val Trp Asn Arg Val Trp Ile Gln Glu Asn Pro Trp Met Glu Asp Lys
3305 3310 3315

ACT CCA GTG GAA TCA TGG GAG GAA ATC CCA TAC TTG GGG AAA AGA GAA 10098
Thr Pro Val Glu Ser Trp Glu Glu Ile Pro Tyr Leu Gly Lys Arg Glu
3320 3325 3330

GAC CAA TGG TGC GGC TCA TTG ATT GGG TTA ACA AGC AGG GCC ACC TGG 10146
Asp Gln Trp Cys Gly Ser Leu Ile Gly Leu Thr Ser Arg Ala Thr Trp
3335 3340 3345 3350

GCA AAG AAC ATC CAA GCA GCA ATA AAT CAA GTT AGA TCC CTT ATA GGC 10194
Ala Lys Asn Ile Gln Ala Ala Ile Asn Gln Val Arg Ser Leu Ile Gly
3355 3360 3365

AAT GAA GAA TAC ACA GAT TAC ATG CCA TCC ATG AAA AGA TTC AGA AGA 10242
Asn Glu Glu Tyr Thr Asp Tyr Met Pro Ser Met Lys Arg Phe Arg Arg
3370 3375 3380



GAA GAG GAA GAA GCA GGA GTT CTG TGG TAGAAAGCAA AACTAACATG AAACAAGG 10297
Glu Glu Glu Glu Ala Gly Val Leu Trp
3385 3390

CTAGAAGTCA GGTCGGATTA AGCCATAGTA CGGAAAAAAC TATGCTACCT GTGAGCCCCG 10357
TCCAAGGACG TTAAAAGAAG TCAGGCCATC ATAAATGCCA TAGCTTGAGT AAACTATGCA 10417
GCCTGTAGCT CCACCTGAGA AGGTGTAAAA AATCCGGGAG GCCACAAACC ATGGAAGCTG 10477
TACGCATGGC GTAGTGGACT AGCGGTTAGA GAGGACCCCT CCCTTACAAA TCGCAGCAAC 10537
AATGGGGGCC CAAGGCGAGA TGAAGCTGTA GTCTCGCTGG AAGGACTAGA GGTTAGAGGA 10597
GACCCCCCCG AAACAAAAAA CAGCATATTG ACGCTGGGAA AGACCAGAGA TCCTGCTGTC 10657
TCCTCAGCAT CATTCCAGGC ACAGAACGCC AGAAAATGGA ATGGTGCTGT TGAATCAACA 10717
GGTTCT 10723



(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10723 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTISENSE: NO
(v) FRAGMENT TYPE:
(vi) ORIGINAL SOURCE:

(ix) FEATURE:
(A) NAME/KEY: Coding Sequence
(B) LOCATION: 97...10269
(D) OTHER INFORMATION: 
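
The running nucleotide totals printed at the right of each sequence row, and the residue numbers printed beneath the amino acid lines, can be cross-checked against the coding-sequence location stated in this feature table. The following is a minimal sketch of that arithmetic, not part of the original listing. The helper names are hypothetical; only the coordinates 97 and 10269 are taken from the feature table, and the sketch assumes the coding sequence is read in a single frame starting at position 97.

```python
# Illustrative sketch only: all function names here are hypothetical.
# The coordinates 97 and 10269 are the coding-sequence bounds from the feature table.
CDS_START = 97
CDS_END = 10269


def codon_count(start: int, end: int) -> int:
    """Number of complete codons in the 1-based, inclusive range start..end."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("range is not a whole number of codons")
    return length // 3


def last_base_of_residue(residue: int, cds_start: int = CDS_START) -> int:
    """1-based nucleotide position of the third base of the codon for `residue`."""
    return cds_start + 3 * residue - 1


def residue_at(nucleotide: int, cds_start: int = CDS_START) -> int:
    """1-based residue number whose codon contains `nucleotide`."""
    if nucleotide < cds_start:
        raise ValueError("position lies in the 5' noncoding region")
    return (nucleotide - cds_start) // 3 + 1


if __name__ == "__main__":
    print(codon_count(CDS_START, CDS_END))  # 3391
    print(last_base_of_residue(6))          # 114
    print(residue_at(2130))                 # 678
```

Under these assumptions, the 3391 codons agree with the final residue (Trp) following the label 3390, and a row ending at nucleotide 114 ends on residue 6, as in the first coding row of the sequence description.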

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

AGTTGTTAGT CTACGTGGAC CGACAAAGAC AGATTCTTTG AGGGAGCTAA GCTCAATGTA 60
GTTCTAACAG TTTTTTAATT AGAGAGCAGA TCTCTG ATG AAT AAC CAA CGG AAA 114
Met Asn Asn Gln Arg Lys
1 5

AAG GCG AAA AAC ACG CCT TTC AAT ATG CTG AAA CGC GAG AGA AAC CGC 162
Lys Ala Lys Asn Thr Pro Phe Asn Met Leu Lys Arg Glu Arg Asn Arg
10 15 20

GTG TCG ACT GTG CAA CAG CTG ACA AAG AGA TTC TCA CTT GGA ATG CTG 210
Val Ser Thr Val Gln Gln Leu Thr Lys Arg Phe Ser Leu Gly Met Leu
25 30 35

CAG GGA CGA GGA CCA TTA AAA CTG TTC ATG GCC CTG GTG GCG TTC CTT 258
Gln Gly Arg Gly Pro Leu Lys Leu Phe Met Ala Leu Val Ala Phe Leu
40 45 50

CGT TTC CTA ACA ATC CCA CCA ACA GCA GGG ATA TTG AAG AGA TGG GGA 306
Arg Phe Leu Thr Ile Pro Pro Thr Ala Gly Ile Leu Lys Arg Trp Gly
55 60 65 70

ACA ATT AAA AAA TCA AAA GCT ATT AAT GTT TTG AGA GGG TTC AGG AAA 354
Thr Ile Lys Lys Ser Lys Ala Ile Asn Val Leu Arg Gly Phe Arg Lys
75 80 85

GAG ATT GGA AGG ATG CTG AAC ATC TTG AAT AGG AGA CGC AGA TCT GCA 402
Glu Ile Gly Arg Met Leu Asn Ile Leu Asn Arg Arg Arg Arg Ser Ala
90 95 100

GGC ATG ATC ATT ATG CTG ATT CCA ACA GTG ATG GCG TTC CAT TTA ACC 450
Gly Met Ile Ile Met Leu Ile Pro Thr Val Met Ala Phe His Leu Thr
105 110 115

ACA CGT AAC GGA GAA CCA CAC ATG ATC GTC AGC AGA CAA GAG AAA GGG 498
Thr Arg Asn Gly Glu Pro His Met Ile Val Ser Arg Gln Glu Lys Gly
120 125 130

AAA AGT CTT CTG TTT AAA ACA GAG GTT GGC GTG AAC ATG TGT ACC CTC 546
Lys Ser Leu Leu Phe Lys Thr Glu Val Gly Val Asn Met Cys Thr Leu
135 140 145 150

ATG GCC ATG GAC CTT GGT GAA TTG TGT GAA GAC ACA ATC ACG TAC AAG 594
Met Ala Met Asp Leu Gly Glu Leu Cys Glu Asp Thr Ile Thr Tyr Lys
155 160 165

TGT CCC CTT CTC AGG CAG AAT GAG CCA GAA GAC ATA GAC TGT TGG TGC 642
Cys Pro Leu Leu Arg Gln Asn Glu Pro Glu Asp Ile Asp Cys Trp Cys
170 175 180

NAC TCT ACG TCC ACG TGG GTA ACT TAT GGG ACG TGT ACC ACC ATG GGA 690
Xaa Ser Thr Ser Thr Trp Val Thr Tyr Gly Thr Cys Thr Thr Met Gly
185 190 195

GAA CAT AGA AGA GAA AAA AGA TCA GTG GCA CTC GTT CCA CAT GTG GGA 738
Glu His Arg Arg Glu Lys Arg Ser Val Ala Leu Val Pro His Val Gly
200 205 210

ATG GGA CTG GAG ACA CGA ACT GAA ACA TGG ATG TCA TCA GAA GGG GCC 786
Met Gly Leu Glu Thr Arg Thr Glu Thr Trp Met Ser Ser Glu Gly Ala
215 220 225 230

TGG AAA CAT GTC CAG AGA ATT GAA ACT TGG ATC TTG AGA CAT CCA GGC 834
Trp Lys His Val Gln Arg Ile Glu Thr Trp Ile Leu Arg His Pro Gly
235 240 245

TTC ACC ATG ATG GCA GCA ATC CTG GCA TAC ACC ATA GGA ACG ACA CAT 882
Phe Thr Met Met Ala Ala Ile Leu Ala Tyr Thr Ile Gly Thr Thr His
250 255 260

TTC CAA AGA GCC CTG ATT TTC ATC TTA CTG ACA GCT GTC ACT CCT TCA 930
Phe Gln Arg Ala Leu Ile Phe Ile Leu Leu Thr Ala Val Thr Pro Ser
265 270 275

ATG ACA ATG CGT TGC ATA GGA ATG TCA AAT AGA GAC TTT GTG GAA GGG 978
Met Thr Met Arg Cys Ile Gly Met Ser Asn Arg Asp Phe Val Glu Gly
280 285 290

GTT TCA GGA GGA AGC TGG GTT GAC ATA GTC TTA GAA CAT GGA AGC TGT 1026
Val Ser Gly Gly Ser Trp Val Asp Ile Val Leu Glu His Gly Ser Cys
295 300 305 310

GTG ACG ACG ATG GCA AAA AAC AAA CCA ACA TTG GAT TTT GAA CTG ATA 1074
Val Thr Thr Met Ala Lys Asn Lys Pro Thr Leu Asp Phe Glu Leu Ile
315 320 325

[...] GAA GCC AAA CAG CCT GCC ACC CTA AGG AAG TAC TGT ATA GAG 1122
Lys Thr Glu Ala Lys Gln Pro Ala Thr Leu Arg Lys Tyr Cys Ile Glu
330 335 340

GCA AAG CTA ACC NAC ACA ACA ACA GAA TCT CGC TGC CCA ACA CAA GGG 1170
Ala Lys Leu Thr Xaa Thr Thr Thr Glu Ser Arg Cys Pro Thr Gln Gly
345 350 355

GAA CCC AGC CTA AAT GAA GAG CAG GAC AAA AGG TTC GTC TGC AAA CAC 1218
Glu Pro Ser Leu Asn Glu Glu Gln Asp Lys Arg Phe Val Cys Lys His
360 365 370

TCC ATG GTA GAC AGA GGA TGG GGA AAT GGA TGT GGA CTA TTT GGA AAG 1266
Ser Met Val Asp Arg Gly Trp Gly Asn Gly Cys Gly Leu Phe Gly Lys
375 380 385 390

[...] TGT GCT ATG [...] AGA TGC AAA AAG AAC ATG GAA 1314
Gly Gly Ile Val Thr Cys Ala Met Phe Arg Cys Lys Lys Asn Met Glu
395 400 405

GGA AAA GTT GTG CAA CCA GAA AAC TTG GAA TAC ACC ATT GTG ATA ACA 1362
Gly Lys Val Val Gln Pro Glu Asn Leu Glu Tyr Thr Ile Val Ile Thr
410 415 420

[...] CAT GCA GTC GGA NAT GAC ACA GGA AAA CAT 1410
Pro His Ser Gly Glu Glu His Ala Val Gly Xaa Asp Thr Gly Lys His
425 430 435

GGC AAG GAA ATC AAA ATA ACA CCA CAG AGT TCC ATC ACA GAA GCA GAA 1458
Gly Lys Glu Ile Lys Ile Thr Pro Gln Ser Ser Ile Thr Glu Ala Glu
440 445 450

TTG ACA GGT TAT GGC ACT GTC ACA ATG GAG TGC TCT CCA AGA ACG GGC 1506
Leu Thr Gly Tyr Gly Thr Val Thr Met Glu Cys Ser Pro Arg Thr Gly
455 460 465 470

CTC GAC TTC AAT GAG ATG GTG TTG CTG CAG ATG GAA AAT AAA GCT TGG 1554
Leu Asp Phe Asn Glu Met Val Leu Leu Gln Met Glu Asn Lys Ala Trp
475 480 485

CTG GTG CAC AGG CAA TGG TTC CTA GAC CTG CCG TTA CCA TGG TTG CCC 1602
Leu Val His Arg Gln Trp Phe Leu Asp Leu Pro Leu Pro Trp Leu Pro
490 495 500

GGA GCG GAC ACA CAA GGG TCA AAT TGG ATA CAG AAA GAG ACA TTG GTC 1650
Gly Ala Asp Thr Gln Gly Ser Asn Trp Ile Gln Lys Glu Thr Leu Val
505 510 515

ACT TTC AAA AAT CCC CAT GCG AAG AAA CAG GAT GTT GTT GTT TTA GGA 1698
Thr Phe Lys Asn Pro His Ala Lys Lys Gln Asp Val Val Val Leu Gly
520 525 530

TCC CAA GAA GGG GCC ATG CAC ACA GCA CTT ACA GGG GCC ACA GAA ATC 1746
Ser Gln Glu Gly Ala Met His Thr Ala Leu Thr Gly Ala Thr Glu Ile
535 540 545 550

CAA ATG TCA TCA GGA AAC TTA CTC TTC ACA GGA CAT CTC AAG TGC AGG 1794
Gln Met Ser Ser Gly Asn Leu Leu Phe Thr Gly His Leu Lys Cys Arg
555 560 565

CTG AGA ATG GAC AAG CTA CAG CTC AAA GGA ATG TCA TAC TCT ATG TGC 1842
Leu Arg Met Asp Lys Leu Gln Leu Lys Gly Met Ser Tyr Ser Met Cys
570 575 580

ACA GGA AAG TTT AAA GTT GTG AAG GAA ATA GCA GAA ACA CAA CAT GGA 1890
Thr Gly Lys Phe Lys Val Val Lys Glu Ile Ala Glu Thr Gln His Gly
585 590 595

ACA ATA GTT ATC AGA GTG CAA TAT GAA GGG GAC GGC TCT CCA TGC AAG 1938
Thr Ile Val Ile Arg Val Gln Tyr Glu Gly Asp Gly Ser Pro Cys Lys
600 605 610

ATC CCT TTT GAG ATA ATG GAT TTG GAA AAA AGA CAT GTC TTA GGT CGC 1986
Ile Pro Phe Glu Ile Met Asp Leu Glu Lys Arg His Val Leu Gly Arg
615 620 625 630

CTG ATT ACA GTC AAC CCA ATT GTG ACA GAA AAA GAT AGC CCA GTC AAC 2034
Leu Ile Thr Val Asn Pro Ile Val Thr Glu Lys Asp Ser Pro Val Asn
635 640 645

ATA GAA GCA GAA CCT CCA TTT GGA GAC AGC TAC ATC ATC ATA GGA GTA 2082
Ile Glu Ala Glu Pro Pro Phe Gly Asp Ser Tyr Ile Ile Ile Gly Val
650 655 660

GAG CCG GGA CAA CTG AAG CTC AAC TGG TTT AAG AAA GGA AGT TCT ATC 2130 
Glu Pro Gly Gin Leu Lys Leu Asn Trp Phe Lys Lys Gly Ser Ser He 
665 670 " * 675 
GGC CAA ATG TTT GAG ACA ACA ATG AGG GGG GCG AAG AGA ATG GCC ATT 2178
Gly Gln Met Phe Glu Thr Thr Met Arg Gly Ala Lys Arg Met Ala Ile
680 685 690

TTA GGT GAC ACA GCC TGG GAT TTT GGA TCC TTG GGA GGA GTG TTT ACA 2226
Leu Gly Asp Thr Ala Trp Asp Phe Gly Ser Leu Gly Gly Val Phe Thr
695 700 705 710

TCT ATA GGA AAG GCT CTC CAC CAA GTC TTT GGA GCA ATC TAT GGA GCT 2274
Ser Ile Gly Lys Ala Leu His Gln Val Phe Gly Ala Ile Tyr Gly Ala
715 720 725

GCC TTC AGT GGG GTT TCA TGG ACT ATG AAA ATC CTC ATA GGA GTC ATT 2322
Ala Phe Ser Gly Val Ser Trp Thr Met Lys Ile Leu Ile Gly Val Ile
730 735 740

ATC ACA TGG ATA GGA ATG AAT TCA CGC AGC ACC TCA CTG TCT GTG ACA 2370
Ile Thr Trp Ile Gly Met Asn Ser Arg Ser Thr Ser Leu Ser Val Thr
745 750 755

CTA GTA TTG GTG GGA ATT GTG ACA CTG TAT TTG GGA GTC ATG GTG CAG 2418
Leu Val Leu Val Gly Ile Val Thr Leu Tyr Leu Gly Val Met Val Gln
760 765 770

GCC GAT AGT GGT TGC GTT GTG AGC TGG AAA AAC AAA GAA CTG AAA TGT 2466
Ala Asp Ser Gly Cys Val Val Ser Trp Lys Asn Lys Glu Leu Lys Cys
775 780 785 790

GGC AGT GGG ATT TTC ATC ACA GAC AAC GTG CAC ACA TGG ACA GAA CAA 2514
Gly Ser Gly Ile Phe Ile Thr Asp Asn Val His Thr Trp Thr Glu Gln
795 800 805

TAC AAG TTC CAA CCA GAA TCC CCT TCA AAA CTA GCT TCA GCT ATC CAG 2562
Tyr Lys Phe Gln Pro Glu Ser Pro Ser Lys Leu Ala Ser Ala Ile Gln
810 815 820

AAA GCC CAT GAA GAG GAC ATT TGT GGA ATC CGC TCA GTA ACA AGA CTG 2610
Lys Ala His Glu Glu Asp Ile Cys Gly Ile Arg Ser Val Thr Arg Leu
825 830 835

GAG AAT CTG ATG TGG AAA CAA ATA ACA CCA GAA TTG AAT CAC ATT CTA 2658
Glu Asn Leu Met Trp Lys Gln Ile Thr Pro Glu Leu Asn His Ile Leu
840 845 850

TCA GAA AAT GAG GTG AAG TTA ACT ATT ATG ACA GGA GAC ATC AAA GGA 2706
Ser Glu Asn Glu Val Lys Leu Thr Ile Met Thr Gly Asp Ile Lys Gly
855 860 865 870

ATC ATG CAG GCA GGA AAA CGA TCT CTG CGG CCT CAG CCC ACT GAG CTG 2754
Ile Met Gln Ala Gly Lys Arg Ser Leu Arg Pro Gln Pro Thr Glu Leu
875 880 885

AAG TAT TCA TGG AAA ACA TGG GGC AAA GCA AAA ATG CTC TCT ACA GAG 2802
Lys Tyr Ser Trp Lys Thr Trp Gly Lys Ala Lys Met Leu Ser Thr Glu
890 895 900

TCT CAT NAC CAG ACC TTT CTC ATT GAT GGC CCC GAA ACA GCA GAA TGC 2850
Ser His Xaa Gln Thr Phe Leu Ile Asp Gly Pro Glu Thr Ala Glu Cys
905 910 915
CCC AAC ACA AAT AGA GCT TGG AAT TCG TTG GAA GTT GAA GAC TAT GGC 2898
Pro Asn Thr Asn Arg Ala Trp Asn Ser Leu Glu Val Glu Asp Tyr Gly
920 925 930

TTT GGA GTA TTC ACC ACC AAT ATA TGG CTA AAA TTG AAA GAA AAA CAG 2946
Phe Gly Val Phe Thr Thr Asn Ile Trp Leu Lys Leu Lys Glu Lys Gln
935 940 945 950

GAT GTA TTC TGC GAC TCA AAA CTC ATG TCA GCG GCC ATA AAA GAC AAC 2994
Asp Val Phe Cys Asp Ser Lys Leu Met Ser Ala Ala Ile Lys Asp Asn
955 960 965

AGA GCC GTC CAT GCC GAT ATG GGT TAT TGG ATA GAA AGT GCA CTC NAT 3042
Arg Ala Val His Ala Asp Met Gly Tyr Trp Ile Glu Ser Ala Leu Xaa
970 975 980

GAC ACA TGG AAG ATA GAG AAA GCC TCT TTC ATT GAA GTT AAA AAC TGC 3090
Asp Thr Trp Lys Ile Glu Lys Ala Ser Phe Ile Glu Val Lys Asn Cys
985 990 995

CAC TGG CCA AAA TCA CAC ACC CTC TGG AGC AAT GGA GTG CTA GAA AGT 3138
His Trp Pro Lys Ser His Thr Leu Trp Ser Asn Gly Val Leu Glu Ser
1000 1005 1010

GAG ATG ATA ATT CCA AAG AAT CTC GCT GGA CCA GTG TCT CAA CAC AAC 3186
Glu Met Ile Ile Pro Lys Asn Leu Ala Gly Pro Val Ser Gln His Asn
1015 1020 1025 1030

TAT AGA CCA GGC TAC CAT ACA CAA ATA ACA GGA CCA TGG CAT CTA GGT 3234
Tyr Arg Pro Gly Tyr His Thr Gln Ile Thr Gly Pro Trp His Leu Gly
1035 1040 1045

AAG CTT GAG ATG GAC TTT GAT TTC TGT GAT GGA ACA ACA GTG GTA GTG 3282
Lys Leu Glu Met Asp Phe Asp Phe Cys Asp Gly Thr Thr Val Val Val
1050 1055 1060

ACT GAG GAC TGC GGA AAT AGA GGA CCC TCT TTG AGA ACA ACC ACT GCC 3330
Thr Glu Asp Cys Gly Asn Arg Gly Pro Ser Leu Arg Thr Thr Thr Ala
1065 1070 1075

TCT GGA AAA CTC ATA ACA GAA TGG TGC TGC CGA TCT TGC ACA TTA CCA 3378
Ser Gly Lys Leu Ile Thr Glu Trp Cys Cys Arg Ser Cys Thr Leu Pro
1080 1085 1090

CCG CTA AGA TAC AGA GGT GAG GAT GGG TGC TGG TAC GGG ATG GAA ATC 3426
Pro Leu Arg Tyr Arg Gly Glu Asp Gly Cys Trp Tyr Gly Met Glu Ile
1095 1100 1105 1110

AGA CCA TTG AAG GAG AAA GAA GAG AAT TTG GTC AAC TCC TTG GTC ACA 3474
Arg Pro Leu Lys Glu Lys Glu Glu Asn Leu Val Asn Ser Leu Val Thr
1115 1120 1125

GCT GGA CAT GGG CAG GTC GAC AAC TTT TCA CTA GGA GTC TTG GGA ATG 3522
Ala Gly His Gly Gln Val Asp Asn Phe Ser Leu Gly Val Leu Gly Met
1130 1135 1140

GCA TTG TTC CTG GAG GAA ATG CTT AGG ACC CGA GTA GGA ACG AAA CAT 3570
Ala Leu Phe Leu Glu Glu Met Leu Arg Thr Arg Val Gly Thr Lys His
1145 1150 1155
GCA ATA CTA CTA GTT GCA GTT TCT TTT GTG ACA TTG ATC ACA GGG AAC 3618
Ala Ile Leu Leu Val Ala Val Ser Phe Val Thr Leu Ile Thr Gly Asn
1160 1165 1170

ATG TCC TTT AGA GAC CTG GGA AGA GTG ATG GTT ATG GTA GGC GCC ACT 3666
Met Ser Phe Arg Asp Leu Gly Arg Val Met Val Met Val Gly Ala Thr
1175 1180 1185 1190

ATG ACG GAT GAC ATA GGT ATG GGC GTG ACT TAT CTT GCC CTA CTA GCA 3714
Met Thr Asp Asp Ile Gly Met Gly Val Thr Tyr Leu Ala Leu Leu Ala
1195 1200 1205

GCC TTC AAA GTC AGA CCA ACT TTT GCA GCT GGA CTA CTC TTG AGA AAG 3762
Ala Phe Lys Val Arg Pro Thr Phe Ala Ala Gly Leu Leu Leu Arg Lys
1210 1215 1220

CTG ACC TCC AAG GAA TTG ATG ATG ACT ACT ATA GGA ATT GTA CTC CTC 3810
Leu Thr Ser Lys Glu Leu Met Met Thr Thr Ile Gly Ile Val Leu Leu
1225 1230 1235

TCC CAG AGC ACC ATA CCA GAG ACC ATT CTT GAG TTG ACT GAT GCG TTA 3858
Ser Gln Ser Thr Ile Pro Glu Thr Ile Leu Glu Leu Thr Asp Ala Leu
1240 1245 1250

GCC TTA GGC ATG ATG GTC CTC AAA ATG GTG AGA AAT ATG GAA AAG TAT 3906
Ala Leu Gly Met Met Val Leu Lys Met Val Arg Asn Met Glu Lys Tyr
1255 1260 1265 1270

CAA TTG GCA GTG ACT ATC ATG GCT ATC TTG TGC GTC CCA AAC GCA GTG 3954
Gln Leu Ala Val Thr Ile Met Ala Ile Leu Cys Val Pro Asn Ala Val
1275 1280 1285

ATA TTA CAA AAC GCA TGG AAA GTG AGT TGC ACA ATA TTG GCA GTG GTG 4002
Ile Leu Gln Asn Ala Trp Lys Val Ser Cys Thr Ile Leu Ala Val Val
1290 1295 1300

TCC GTT TCC CCA CTG TTC TTA ACA TCC TCA CAG CAA AAA ACA GAT TGG 4050
Ser Val Ser Pro Leu Phe Leu Thr Ser Ser Gln Gln Lys Thr Asp Trp
1305 1310 1315

ATA CCA TTA GCA TTG ACG ATC AAA GGT CTC AAT CCA ACA GCT ATT TTT 4098
Ile Pro Leu Ala Leu Thr Ile Lys Gly Leu Asn Pro Thr Ala Ile Phe
1320 1325 1330

CTA ACA ACC CTC TCA AGA ACC AGC AAG AAA AGG AGC TGG CCA TTA AAT 4146
Leu Thr Thr Leu Ser Arg Thr Ser Lys Lys Arg Ser Trp Pro Leu Asn
1335 1340 1345 1350

GAG GCT ATC ATG GCA GTC GGG ATG GTG AGC ATT TTA GCC AGT TCT CTC 4194
Glu Ala Ile Met Ala Val Gly Met Val Ser Ile Leu Ala Ser Ser Leu
1355 1360 1365

CTA AAA AAT GAT ATT CCC ATG ACA GGA CCA TTA GTG GCT GGA GGG CTC 4242
Leu Lys Asn Asp Ile Pro Met Thr Gly Pro Leu Val Ala Gly Gly Leu
1370 1375 1380

CTC ACT GTG TGC TAC GTG CTC ACT GGA CGA TCG GCC GAT TTG GAA CTG 4290
Leu Thr Val Cys Tyr Val Leu Thr Gly Arg Ser Ala Asp Leu Glu Leu
1385 1390 1395
GAG AGA GCA GCC GAT GTC AAA TGG GAA GAC CAG GCA GAG ATA TCA GGA 4338
Glu Arg Ala Ala Asp Val Lys Trp Glu Asp Gln Ala Glu Ile Ser Gly
1400 1405 1410

AGC AGT CCA ATC CTG TCA ATA ACA ATA TCA GAA GAT GGT AGC ATG TCG 4386
Ser Ser Pro Ile Leu Ser Ile Thr Ile Ser Glu Asp Gly Ser Met Ser
1415 1420 1425 1430

ATA AAA AAT GAA GAG GAA GAA CAA ACA CTG ACC ATA CTC ATT AGA ACA 4434
Ile Lys Asn Glu Glu Glu Glu Gln Thr Leu Thr Ile Leu Ile Arg Thr
1435 1440 1445

GGA TTG CTG GTG ATC TCA GGA CTT TTT CCT GTA TCA ATA CCA ATC ACG 4482
Gly Leu Leu Val Ile Ser Gly Leu Phe Pro Val Ser Ile Pro Ile Thr
1450 1455 1460

GCA GCA GCA TGG TAC CTG TGG GAA GTG AAG AAA CAA CGG GCC GGA GTA 4530
Ala Ala Ala Trp Tyr Leu Trp Glu Val Lys Lys Gln Arg Ala Gly Val
1465 1470 1475

TTG TGG GAT GTT CCT TCA CCC CCA CCC ATG GGA AAG GCT GAA CTG GAA 4578
Leu Trp Asp Val Pro Ser Pro Pro Pro Met Gly Lys Ala Glu Leu Glu
1480 1485 1490

GAT GGA GCC TAT AGA ATT AAG CAA AAA GGG ATT CTT GGA TAT TCC CAG 4626
Asp Gly Ala Tyr Arg Ile Lys Gln Lys Gly Ile Leu Gly Tyr Ser Gln
1495 1500 1505 1510

ATC GGA GCC GGA GTT TAC AAA GAA GGA ACA TTC CAT ACA ATG TGG CAT 4674
Ile Gly Ala Gly Val Tyr Lys Glu Gly Thr Phe His Thr Met Trp His
1515 1520 1525

GTC ACA CGT GGC GCT GTT CTA ATG CAT AAA GGA AAG AGG ATT GAA CCA 4722
Val Thr Arg Gly Ala Val Leu Met His Lys Gly Lys Arg Ile Glu Pro
1530 1535 1540

TCA TGG GCG GAC GTC AAG AAA GAC CTA ATA TCA TAT GGA GGA GGC TGG 4770
Ser Trp Ala Asp Val Lys Lys Asp Leu Ile Ser Tyr Gly Gly Gly Trp
1545 1550 1555

AAG TTA GAA GGA GAA TGG AAG GAA GGA GAA GAA GTC CAG GTA TTG GCA 4818
Lys Leu Glu Gly Glu Trp Lys Glu Gly Glu Glu Val Gln Val Leu Ala
1560 1565 1570

CTG GAG CCT GGA AAA AAT CCA AGA GCC GTC CAA ACG AAA CCT GGT CTT 4866
Leu Glu Pro Gly Lys Asn Pro Arg Ala Val Gln Thr Lys Pro Gly Leu
1575 1580 1585 1590

TTC AAA ACC AAC GCC GGA ACA ATA GGT GCT GTA TCT CTG GAC TTT TCT 4914
Phe Lys Thr Asn Ala Gly Thr Ile Gly Ala Val Ser Leu Asp Phe Ser
1595 1600 1605

CCT GGA ACG TCA GGA TCT CCA ATT ATC GAC AAA AAA GGA AAA GTT GTG 4962
Pro Gly Thr Ser Gly Ser Pro Ile Ile Asp Lys Lys Gly Lys Val Val
1610 1615 1620

GGT CTT TAT GGT AAT GGT GTT GTT ACA AGG AGT GGA GCA TAT GTG AGT 5010
Gly Leu Tyr Gly Asn Gly Val Val Thr Arg Ser Gly Ala Tyr Val Ser
1625 1630 1635
GCT ATA GCC CAG ACT GAA AAA AGC ATT GAA GAC AAC CCA GAG ATC GAA 5058
Ala Ile Ala Gln Thr Glu Lys Ser Ile Glu Asp Asn Pro Glu Ile Glu
1640 1645 1650

GAT GAC ATT TTC CGA AAG AGA AGA CTG ACC ATC ATG GAC CTC CAC CCA 5106
Asp Asp Ile Phe Arg Lys Arg Arg Leu Thr Ile Met Asp Leu His Pro
1655 1660 1665 1670

GGA GCG GGA AAG ACG AAG AGA TAG CTT CCG GCC ATA GTC AGA GAA GCT 5154
Gly Ala Gly Lys Thr Lys Arg Tyr Leu Pro Ala Ile Val Arg Glu Ala
1675 1680 1685

ATA AAA CGG GGT TTG AGA ACA TTA ATC TTG GCC CCC ACT AGA GTT GTG 5202
Ile Lys Arg Gly Leu Arg Thr Leu Ile Leu Ala Pro Thr Arg Val Val
1690 1695 1700

GCA GCT GAA ATG GAG GAA GCC CTT AGA GGA CTT CCA ATA AGA TAC CAG 5250
Ala Ala Glu Met Glu Glu Ala Leu Arg Gly Leu Pro Ile Arg Tyr Gln
1705 1710 1715

ACC CCA GCC ATC AGA GCT GAG CAC ACC GGG CGG GAG ATT GTG GAC CTA 5298
Thr Pro Ala Ile Arg Ala Glu His Thr Gly Arg Glu Ile Val Asp Leu
1720 1725 1730

ATG TGT CAT GCC ACA TTT ACC ATG AGG CTG CTA TCA CCA GTT AGA GTG 5346
Met Cys His Ala Thr Phe Thr Met Arg Leu Leu Ser Pro Val Arg Val
1735 1740 1745 1750

CCA AAC TAC AAC CTG ATT ATC ATG GAC GAA GCC CAT TTC ACA GAC CCA 5394
Pro Asn Tyr Asn Leu Ile Ile Met Asp Glu Ala His Phe Thr Asp Pro
1755 1760 1765

GCA AGT ATA GCA GCT AGA GGA TAC ATC TCA ACT CGA GTG GAG ATG GGT 5442
Ala Ser Ile Ala Ala Arg Gly Tyr Ile Ser Thr Arg Val Glu Met Gly
1770 1775 1780

GAG GCA GCT GGG ATT TTT ATG ACA GCC ACT CCC CCG GGA AGC AGA GAC 5490
Glu Ala Ala Gly Ile Phe Met Thr Ala Thr Pro Pro Gly Ser Arg Asp
1785 1790 1795

CCA TTT CCT CAG AGC AAT GCA CCA ATC ATA GAT GAA GAA AGA GAA ATC 5538
Pro Phe Pro Gln Ser Asn Ala Pro Ile Ile Asp Glu Glu Arg Glu Ile
1800 1805 1810

CCT GAA CGT TCG TGG AAT TCC GGA CAT GAA TGG GTC ACG GAT TTT AAA 5586
Pro Glu Arg Ser Trp Asn Ser Gly His Glu Trp Val Thr Asp Phe Lys
1815 1820 1825 1830

GGG AAG ACT GTT TGG TTC GTT CCA AGT ATA AAA GCA GGA AAT GAT ATA 5634
Gly Lys Thr Val Trp Phe Val Pro Ser Ile Lys Ala Gly Asn Asp Ile
1835 1840 1845

GCA GCT TGC CTG AGG AAA AAT GGA AAG AAA GTG ATA CAA CTC AGT AGG 5682
Ala Ala Cys Leu Arg Lys Asn Gly Lys Lys Val Ile Gln Leu Ser Arg
1850 1855 1860

AAG ACC TTT GAT TCT GAG TAT GTC AAG ACT AGA ACC AAT GAT TGG GAC 5730
Lys Thr Phe Asp Ser Glu Tyr Val Lys Thr Arg Thr Asn Asp Trp Asp
1865 1870 1875
TTC GTG GTT ACA ACT GAC ATT TCA GAA ATG GGT GCC AAT TTC AAG GCT 5778
Phe Val Val Thr Thr Asp Ile Ser Glu Met Gly Ala Asn Phe Lys Ala
1880 1885 1890

GAG AGG GTT ATA GAC CCC AGA CGC TGC ATG AAA CCA GTC ATA CTA ACA 5826
Glu Arg Val Ile Asp Pro Arg Arg Cys Met Lys Pro Val Ile Leu Thr
1895 1900 1905 1910

GAT GGT GAA GAG CGG GTG ATT CTG GCA GGA CCT ATG CCA GTG ACC CAC 5874
Asp Gly Glu Glu Arg Val Ile Leu Ala Gly Pro Met Pro Val Thr His
1915 1920 1925

TCT AGT GCA GCA CAA AGA AGA GGG AGA ATA GGA AGA AAT CCA AAA AAT 5922
Ser Ser Ala Ala Gln Arg Arg Gly Arg Ile Gly Arg Asn Pro Lys Asn
1930 1935 1940

GAG AAT GAC CAG TAC ATA TAC ATG GGG GAA CCT CTG GAA AAT GAT GAA 5970
Glu Asn Asp Gln Tyr Ile Tyr Met Gly Glu Pro Leu Glu Asn Asp Glu
1945 1950 1955

GAC TGT GCA CAC TGG AAA GAA GCT AAA ATG CTC CTA GAT AAC ATC AAC 6018
Asp Cys Ala His Trp Lys Glu Ala Lys Met Leu Leu Asp Asn Ile Asn
1960 1965 1970

ACG CCA GAA GGA ATC ATT CCT AGC ATG TTC GAA CCA GAG CGT GAA AAG 6066
Thr Pro Glu Gly Ile Ile Pro Ser Met Phe Glu Pro Glu Arg Glu Lys
1975 1980 1985 1990

GTG GAT GCC ATT GAT GGC GAA TAC CGC TTG AGA GGA GAA GCA AGG AAA 6114
Val Asp Ala Ile Asp Gly Glu Tyr Arg Leu Arg Gly Glu Ala Arg Lys
1995 2000 2005

ACC TTT GTA GAC TTA ATG AGA AGA GGA GAC CTA CCA GTC TGG TTG GCC 6162
Thr Phe Val Asp Leu Met Arg Arg Gly Asp Leu Pro Val Trp Leu Ala
2010 2015 2020

TAC AGA GTG GCA GCT GAA GGC ATC AAC TAC GCA GAC AGA AGG TGG TGT 6210
Tyr Arg Val Ala Ala Glu Gly Ile Asn Tyr Ala Asp Arg Arg Trp Cys
2025 2030 2035

TTT GAT GGA GTC AAG AAC AAC CAA ATC CTA GAA GAA AAC GTG GAA GTT 6258
Phe Asp Gly Val Lys Asn Asn Gln Ile Leu Glu Glu Asn Val Glu Val
2040 2045 2050

GAA ATC TGG ACA AAA GAA GGG GAA AGG AAG AAA TTG AAA CCC AGA TGG 6306
Glu Ile Trp Thr Lys Glu Gly Glu Arg Lys Lys Leu Lys Pro Arg Trp
2055 2060 2065 2070

TTG GAT GCT AGG ATC TAT TCT GAC CCA CTG GCG CTA AAA GAA TTT AAG 6354
Leu Asp Ala Arg Ile Tyr Ser Asp Pro Leu Ala Leu Lys Glu Phe Lys
2075 2080 2085

GAA TTT GCA GCC GGA AGA AAG TCT CTG ACC CTG AAC CTA ATC ACA GAA 6402
Glu Phe Ala Ala Gly Arg Lys Ser Leu Thr Leu Asn Leu Ile Thr Glu
2090 2095 2100

ATG GGT AGG CTC CCA ACC TTC ATG ACT CAG AAG GCA AGA GAC GCA CTG 6450
Met Gly Arg Leu Pro Thr Phe Met Thr Gln Lys Ala Arg Asp Ala Leu
2105 2110 2115
GAC AAC TTA GCA GTG CTG CAC ACG GCT GAG GCA GGT GGA AGG GCG TAC 6498
Asp Asn Leu Ala Val Leu His Thr Ala Glu Ala Gly Gly Arg Ala Tyr
2120 2125 2130

AAC CAT GCT CTC AGT GAA CTG CCG GAG ACC CTG GAG ACA TTG CTT TTA 6546
Asn His Ala Leu Ser Glu Leu Pro Glu Thr Leu Glu Thr Leu Leu Leu
2135 2140 2145 2150

CTG ACA CTT CTG GCT ACA GTC ACG GGA GGG ATC TTT TTA TTC TTG ATG 6594
Leu Thr Leu Leu Ala Thr Val Thr Gly Gly Ile Phe Leu Phe Leu Met
2155 2160 2165

AGC GCA AGG GGC ATA GGG AAG ATG ACC CTG GGA ATG TGC TGC ATA ATC 6642
Ser Ala Arg Gly Ile Gly Lys Met Thr Leu Gly Met Cys Cys Ile Ile
2170 2175 2180

ACG GCT AGC ATC CTC CTA TGG TAC GCA CAA ATA CAG CCA CAC TGG ATA 6690
Thr Ala Ser Ile Leu Leu Trp Tyr Ala Gln Ile Gln Pro His Trp Ile
2185 2190 2195

GCA GCT TCA ATA ATA CTG GAG TTT TTT CTC ATA GTT TTG CTT ATT CCA 6738
Ala Ala Ser Ile Ile Leu Glu Phe Phe Leu Ile Val Leu Leu Ile Pro
2200 2205 2210

GAA CCT GAA AAA CAG AGA ACA CCC CAA GAC AAC CAA CTG ACC TAC GTT 6786
Glu Pro Glu Lys Gln Arg Thr Pro Gln Asp Asn Gln Leu Thr Tyr Val
2215 2220 2225 2230

GTC ATA GCC ATC CTC ACA GTG GTG GCC GCA ACC ATG GCA AAC GAG ATG 6834
Val Ile Ala Ile Leu Thr Val Val Ala Ala Thr Met Ala Asn Glu Met
2235 2240 2245

GGT TTC CTA GAA AAA ACG AAG AAA GAT CTC GGA TTG GGA AGC ATT GCA 6882
Gly Phe Leu Glu Lys Thr Lys Lys Asp Leu Gly Leu Gly Ser Ile Ala
2250 2255 2260

ACC CAG CAA CCC GAG AGC AAC ATC CTG GAC ATA GAT CTA CGT CCT GCA 6930
Thr Gln Gln Pro Glu Ser Asn Ile Leu Asp Ile Asp Leu Arg Pro Ala
2265 2270 2275

TCA GCA TGG ACG CTG TAT GCC GTG GCC ACA ACA TTT GTT ACA CCA ATG 6978
Ser Ala Trp Thr Leu Tyr Ala Val Ala Thr Thr Phe Val Thr Pro Met
2280 2285 2290

TTG AGA CAT AGC ATT GAA AAT TCC TCA GTG AAT GTG TCC CTA ACA GCT 7026
Leu Arg His Ser Ile Glu Asn Ser Ser Val Asn Val Ser Leu Thr Ala
2295 2300 2305 2310

ATA GCC AAC CAA GCC ACA GTG TTA ATG GGT CTC GGG AAA GGA TGG CCA 7074
Ile Ala Asn Gln Ala Thr Val Leu Met Gly Leu Gly Lys Gly Trp Pro
2315 2320 2325

TTG TCA AAG ATG GAC ATC GGA GTT CCC CTT CTC GCC ATT GGA TGC TAC 7122
Leu Ser Lys Met Asp Ile Gly Val Pro Leu Leu Ala Ile Gly Cys Tyr
2330 2335 2340

TCA CAA GTC AAC CCC ATA ACT CTC ACA GCA GCT CTT TTC TTA TTG GTA 7170
Ser Gln Val Asn Pro Ile Thr Leu Thr Ala Ala Leu Phe Leu Leu Val
2345 2350 2355
GCA CAT TAT GCC ATC ATA GGG CGA GGA CTC CAA GCA AAA GCA ACC AGA 7218
Ala His Tyr Ala Ile Ile Gly Pro Gly Leu Gln Ala Lys Ala Thr Arg
2360 2365 2370

GAA GCT CAG AAA AGA GCA GCG GCG GGC ATC ATG AAA AAC CCA ACT GTC 7266
Glu Ala Gln Lys Arg Ala Ala Ala Gly Ile Met Lys Asn Pro Thr Val
2375 2380 2385 2390

GAT GGA ATA ACA GTG ATT GAC CTA GAT CCA ATA CCT TAT GAT CCA AAG 7314
Asp Gly Ile Thr Val Ile Asp Leu Asp Pro Ile Pro Tyr Asp Pro Lys
2395 2400 2405

TTT GAA AAG CAG TTG GGA CAA GTA ATG CTC CTA GTC CTC TGC GTG ACT 7362
Phe Glu Lys Gln Leu Gly Gln Val Met Leu Leu Val Leu Cys Val Thr
2410 2415 2420

CAA GTA TTG ATG ATG AGG ACT ACA TGG GCT CTG TGT GAG GCT TTA ACC 7410
Gln Val Leu Met Met Arg Thr Thr Trp Ala Leu Cys Glu Ala Leu Thr
2425 2430 2435

TTA GCT ACC GGG CCC ATC TCC ACA TTG TGG GAA GGA AAT CCA GGG AGG 7458
Leu Ala Thr Gly Pro Ile Ser Thr Leu Trp Glu Gly Asn Pro Gly Arg
2440 2445 2450

TTT TGG AAC ACT ACC ATT GCG GTG TCA ATG GCT AAC ATT TTT AGA GGG 7506
Phe Trp Asn Thr Thr Ile Ala Val Ser Met Ala Asn Ile Phe Arg Gly
2455 2460 2465 2470

AGT TAC TTG GCC GGA GCT GGA CTT CTC TTT TCT ATT ATG AAG AAC ACA 7554
Ser Tyr Leu Ala Gly Ala Gly Leu Leu Phe Ser Ile Met Lys Asn Thr
2475 2480 2485

ACC AAC ACA AGA AGG GGA ACT GGC AAC ATA GGA GAG ACG CTT GGA GAG 7602
Thr Asn Thr Arg Arg Gly Thr Gly Asn Ile Gly Glu Thr Leu Gly Glu
2490 2495 2500

AAA TGG AAA AGC CGA TTG AAC GCA TTG GGA AAA AGT GAA TTC CAG ATC 7650
Lys Trp Lys Ser Arg Leu Asn Ala Leu Gly Lys Ser Glu Phe Gln Ile
2505 2510 2515

TAC AAG AAA AGT GGA ATC CAG GAA GTG GAT AGA ACC TTA GCA AAA GAA 7698
Tyr Lys Lys Ser Gly Ile Gln Glu Val Asp Arg Thr Leu Ala Lys Glu
2520 2525 2530

GGC ATT AAA AGA GGA GAA ACG GAC CAT CAC GCT GTG TCG CGA GGC TCA 7746
Gly Ile Lys Arg Gly Glu Thr Asp His His Ala Val Ser Arg Gly Ser
2535 2540 2545 2550

GCA AAA CTG AGA TGG TTC GTT GAG AGA AAC ATG GTC ACA CCA GAA GGG 7794
Ala Lys Leu Arg Trp Phe Val Glu Arg Asn Met Val Thr Pro Glu Gly
2555 2560 2565

AAA GTA GTG GAC CTC GGT TGT GGC AGA GGA GGC TGG TCA TAC TAT TGT 7842
Lys Val Val Asp Leu Gly Cys Gly Arg Gly Gly Trp Ser Tyr Tyr Cys
2570 2575 2580

GGA GGA CTA AAG AAT GTA AGA GAA GTC AAA GGC CTA ACA AAA GGA GGA 7890
Gly Gly Leu Lys Asn Val Arg Glu Val Lys Gly Leu Thr Lys Gly Gly
2585 2590 2595
CCA GGA CAC GAA GAA CCC ATC CCC ATG TCA ACA TAT GGG TGG AAT CTA 7938
Pro Gly His Glu Glu Pro Ile Pro Met Ser Thr Tyr Gly Trp Asn Leu
2600 2605 2610

GTG CGT CTT CAA AGT GGA GTT GAC GTT TTC TTC ATC CCG CCA GAA AAr 7986
Val Arg Leu Gln Ser Gly Val Asp Val Phe Phe Ile Pro Pro Glu Lys
2615 2620 2625 2630

TGT GAC ACA TTA TTG TGT GAC ATA GGG GAG TCA TCA CCA AAT CCC ACA 8034 
Cys Asp Thr Leu Leu Cys Asp Ile Gly Glu Ser Ser Pro Asn Pro Thr
2635 2640 2645 

v=? r^ A ff A £ GA ? GA ACA CTC AGA GTC CTT AAC TTA GTA GAA AAT TGG 8082 
Val Glu Ala Gly Arg Thr Leu Arg Val Leu Asn Leu Val Glu Asn Trp 

2650 2655 2660

TTG AAC AAC AAC ACT CAA TTT TGC ATA AAG GTT CTC AAC CCA TAT ATG 8130 
Leu Asn Asn Asn Thr Gln Phe Cys Ile Lys Val Leu Asn Pro Tyr Met
2665 2670 2675 

CCC TCA GTC ATA GAA AAA ATG GAA GCA CTA CAA AGG AAA TAT GGA GGA 8178 

Pro Ser Val Ile Glu Lys Met Glu Ala Leu Gln Arg Lys Tyr Gly Gly
2680 2685 2690 

Sfa SI? AAT - GCA f TC I CA CGA AAC TGG ACA CAT GAG ATG TAC 8226 

Ala Leu Val Arg Asn Pro Leu Ser Arg Asn Ser Thr His Glu Met Tyr
2695 2700 2705 2710 

55 w= A c CC AAT GCT F C GGG AAC ATA GTG TCA TCA GTG AAC ATG ATT 8274 
Trp Val Ser Asn Ala Ser Gly Asn Ile Val Ser Ser Val Asn Met Ile

2715 2720 2725 

V£ AGG w TG ? TG A T C AAC AGA TTT ACA ATG AGA TA C AAG AAA GCC ACT 8322 
Ser Arg Met Leu Ile Asn Arg Phe Thr Met Arg Tyr Lys Lys Ala Thr 
2730 2735 2740 

£ AG GAG GCG GAT GT T GAC CTC GGA AGC GGA ACG GG T AAC ATC GGG ATT 8370 
Tyr Glu Pro Asp Val Asp Leu Gly Ser Gly Thr Arg Asn Ile Gly Ile
2745 2750 2755 

rt A AGT £ AG A ? A CCA AAC CTA GAT ATA ATT GGG AAA AGA ATA GAA AAA 8418 
Glu Ser Glu Ile Pro Asn Leu Asp Ile Ile Gly Lys Arg Ile Glu Lys
2760 2765 2770

ATA AAG CAA GAG CAT GAA ACA TCA TGG CAC TAT GAC CAA GAC CAC CCA 8466 
Ile Lys Gln Glu His Glu Thr Ser Trp His Tyr Asp Gln Asp His Pro
2775 2780 2785 2790

TAC AAA ACG TGG GCA TAC CAT GGT AGC TAT GAA ACA AAA CAG ACT GGA 8514
Tyr Lys Thr Trp Ala Tyr His Gly Ser Tyr Glu Thr Lys Gln Thr Gly
2795 2800 2805 

l^t GCA TCA TCC A TG GTC AAC GGA GTG GTC AGG CTG CTG ACA AAA CCT 8562 
Ser Ala Ser Ser Met Val Asn Gly Val Val Arg Leu Leu Thr Lys Pro
2810 2815 2820

tS »«S u T ? H TC: £ CC ATG . GTG ACA CAG ATG GCA ATG ACA GAC ACG ACT 8610 
Trp Asp Val Val Pro Met Val Thr Gln Met Ala Met Thr Asp Thr Thr
2825 2830 2835 
CCA TTT GGA CAA CAG CGC GTT TTT AAA GAG AAA GTG GAC ACG AGA ACC
Pro Phe Gly Gln Gln Arg Val Phe Lys Glu Lys Val Asp Thr Arg Thr
2840 2845 2850

CAA GAA CCG AAA GAA GGC ACG AAG AAA CTA ATG AAA ATA ACA GCA GAG 
Gln Glu Pro Lys Glu Gly Thr Lys Lys Leu Met Lys Ile Thr Ala Glu
2855 2860 2865 2870 

TGG CTT TGG AAA GAA TTA GGG AAG AAA AAG ACA CCC AGG ATG TGC ACC 
Trp Leu Trp Lys Glu Leu Gly Lys Lys Lys Thr Pro Arg Met Cys Thr
2875 2880 2885

AGA GAA GAA TTC ACA AGA AAG GTG AGA AGC AAT GCA GCC TTG GGG GCC 
Arg Glu Glu Phe Thr Arg Lys Val Arg Ser Asn Ala Ala Leu Gly Ala
2890 2895 2900 

ATA TTC ACT GAT GAG AAC AAG TGG AAG TCG GCA CGT GAG GCT GTT GAA 
Ile Phe Thr Asp Glu Asn Lys Trp Lys Ser Ala Arg Glu Ala Val Glu
2905 2910 2915 

GAT AGT AGG TTT TGG GAG CTG GTT GAC AAG GAA AGG AAT CTC CAT CTT 

Asp Ser Arg Phe Trp Glu Leu Val Asp Lys Glu Arg Asn Leu His Leu
2920 2925 2930 

GAA GGA AAG TGT GAA ACA TGT GTG TAC AAC ATG ATG GGA AAA AGA GAG 
Glu Gly Lys Cys Glu Thr Cys Val Tyr Asn Met Met Gly Lys Arg Glu 
2935 2940 2945 2950

rt* AAG ? TA S? G GAA TTC GGC A* 0 GCA AAA GGC AGC AGA GCC ATA TGG 
Lys Lys Leu Gly Glu Phe Gly Lys Ala Lys Gly Ser Arg Ala Ile Trp
2955 2960 2965

™ ATG TGG CTT GGA GCA CGC TTC TTA GAG TTT GAA GCC CTA GGA TTC 
Tyr Met Trp Leu Gly Ala Arg Phe Leu Glu Phe Glu Ala Leu Gly Phe 
2970 2975 2980

TTA AAT GAA GAT CAC TGG TTC TCC AGA GAG AAC TCC CTG AGT GGA GTG 
Leu Asn Glu Asp His Trp Phe Ser Arg Glu Asn Ser Leu Ser Gly Val 
2985 2990 2995

GAA GGA GAA GGG CTG CAC AAG CTA GGT TAC ATT CTA AGA GAC GTG AGC 

Glu Gly Glu Gly Leu His Lys Leu Gly Tyr Ile Leu Arg Asp Val Ser
3000 3005 3010 

AA f ^ A £ AG S? A S GA GCA ATG TAT GCC GAT GAC ACC GCA GGA TGG GAT 
Lys Lys Glu Gly Gly Ala Met Tyr Ala Asp Asp Thr Ala Gly Trp Asp 

3015 3020 3025 3030

ACA AGA ATC ACA CTA GAA GAC KKA AAA AAT GAA GAA ATG GTA ACA AAC 
Thr Arg Ile Thr Leu Glu Asp Xaa Lys Asn Glu Glu Met Val Thr Asn
3035 3040 3045 

CAC ATG GAA GGA GAA CAC AAG AAA CTA GCC GAG GCC ATT TTC AAA CTA 
His Met Glu Gly Glu His Lys Lys Leu Ala Glu Ala Ile Phe Lys Leu 
3050 3055 3060 

ACG TAC CAA AAC AAG GTG GTG CGT GTG CAA AGA CCA ACA CCA AGA GGC 
Thr Tyr Gln Asn Lys Val Val Arg Val Gln Arg Pro Thr Pro Arg Gly 
3065 3070 3075 
[Several sequence rows are illegible in the source scan. Legible fragments: "GTT GTT CCA TGT AG" (Val Val Pro Cys Ar) near residues 3215 to 3220; row-end positions 9474, 9522, 9666, 9762 and 9810; residue markers 3185, 3190, 3200, 3205, 3230 and 3235.]

TTG CGG GAG ACG GCC TGT TTG GGG AAG TCT TAP rrr ran *m„ m ~ 
ACT CCA GTG GAA TCA TGG GAG GAA ATC CCA TAC TTG GGG AAA AGA GAA 10098
Thr Pro Val Glu Ser Trp Glu Glu Ile Pro Tyr Leu Gly Lys Arg Glu
3320 3325 3330

GAC CAA TGG TGC GGC TCA TTG ATT GGG TTA ACA AGC AGG GCC ACC TGG 10146
Asp Gln Trp Cys Gly Ser Leu Ile Gly Leu Thr Ser Arg Ala Thr Trp
3335 3340 3345 3350

GCA AAG AAC ATC CAA GCA GCA ATA AAT CAA GTT AGA TCC CTT ATA GGC 10194
Ala Lys Asn Ile Gln Ala Ala Ile Asn Gln Val Arg Ser Leu Ile Gly
3355 3360 3365

AAT GAA GAA TAC ACA GAT TAC ATG CCA TCC ATG AAA AGA TTC AGA AGA 10242
Asn Glu Glu Tyr Thr Asp Tyr Met Pro Ser Met Lys Arg Phe Arg Arg
3370 3375 3380

GAA GAG GAA GAA GCA GGA GTT CTG TGG TAGAAAGCAA AACTAACATG AAACAAGG
Glu Glu Glu Glu Ala Gly Val Leu Trp
3385 3390

CTAGAAGTCA GGTCGGATTA AGCCATAGTA CGGAAAAAAC TATGCTACCT GTGAGCCCCG

TCCAAGGACG TTAAAAGAAG TCAGGCCATC ATAAATGCCA TAGCTTGAGT AAACTATGCA

GCCTGTAGCT CCACCTGAGA AGGTGTAAAA AATCCGGGAG GCCACAAACC ATGGAAGCTG

TACGCATGGC GTAGTGGACT AGCGGTTAGA GAGGACCCCT CCCTTACAAA TCGCAGCAAC

AATGGGGGCC CAAGGCGAGA TGAAGCTGTA GTCTCGCTGG AAGGACTAGA GGTTAGAGGA

GACCCCCCCG AAACAAAAAA CAGCATATTG ACGCTGGGAA AGACCAGAGA TCCTGCTGTC

TCCTCAGCAT CATTCCAGGC ACAGAACGCC AGAAAATGGA ATGGTGCTGT TGAATCAACA

GGTTCT

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO
(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
CCCAGTCACG ACGTTGTAAA ACGAC 25

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
GGATGTGCTG CAAGGCGATT AAGTTGG

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
TGAGCGGATA ACAATTTCAC ACAGG

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
GGCTTTACAC TTTATGCTTC CGGCTCG 27

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 75 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 
(v) FRAGMENT TYPE:
(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

gScgaSS IacIg CTCTA gaaatttaat acgactcact ATAAGTTGTT AGTCTACGTG 60 

75 

(2) INFORMATION FOR SEQ ID NO: 8: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 77 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

CCAGTGAATT CGAGCTCACG CGTAAATTTA ATACGACTCA CTATAAGTTG TTAGTCTACG 60 
TGGACCGACA AAGACAG 77

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
AGTTGTTAGT CTACGTGGAC CGAC 24

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
GACAGATTCT TTGAGGGAGC TGAGCTCAAC GTAG 34 
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
TCAATATGCT GAAACGCGAG AGAAACCG 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
GGGATTGTTA GGAAACGAAG GAACGC

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
CCACCAACAG CAGGGATACT GAAAAGATGG GG

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
TGCAGATCTG CGTCTCCTAT TCAAG 25

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
CGTGAACATG TGTACCCTCA TGGCC 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTISENSE: NO
(v) FRAGMENT TYPE:
(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
TTGCACCAAC AGTCAATGTC TTCAGG 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
ACCAGAAGAC ATAGATTGTT GGTGC 25

(2) INFORMATION FOR SEQ ID NO: 18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18 
GCACCAACAG TCTATGTCTT CTGGC 

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
ATGTTTCCAG GCCCCTTCTG ATGAC 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
GCAGCAATCC TGGCATACAC CATAG 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21 
GGTTGACATA GTCTTAGAAC ATGGAAG 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22 
CTTCCATGTT CTAAGACTAT GTCAACC 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23 
GTCTTAGAAC ATGGAAGTTG TGTGACGACG ATGGC 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24 
ACAACAGAAT CTCGCTGCCC AACAC

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25 
GCAAACACTC CATGGTAGAC AGAGG 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26
CCTCTGTCTA CCATGGAGTG TTTGC 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
CCACATCCAT TTCCCCATCC TCTGTCT 

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28 
GGAAAGGGAG GCATTGTGAC CTGTGCTATG 

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29 
GGAAATCAAA ATAACACCAC AGAGTTCC

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30 
CTGCAGCAAC ACCATCTCAT TGAAGTCGAG GCCC 

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31
GACTTCAATG AGATGGTGCT GCTGC 

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
GCAGCAGCAC CATCTCATTG AAGTC 

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
AAGCTTGGCT GGTGCACAGG CAATGGTT 

(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
TGGTAACGGC AGGTCTAGGA ACCATTG 

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35 
GGACATCTCA AGTGCAGGCT GAG 

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36 
CTCAGCCTGC ACTTGAGATG TCC 

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37 
GAAGGAAATA GCAGAAACAC AACATGG 

(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
CCCTTCATAT TGTACTCTGA TAACTATTGT TCC

(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
CCTCCATTCG GAGACAGCTA CATCATCATA GG

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:
CCTATGATGA TGTAGCTGTC TCCGAATGGA GG

(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41
ATGGCCATTT TAGGTGACAC AGCCTGGGA 

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42
TGTAAACACT CCTCCCkGGG ATCCAAA 

(2) INFORMATION FOR SEQ ID NO:43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43: 
CTCATAGGAG TCATTATCAC ATGGATAGG 

(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 
GGGGATTCTG GTTGGAACTT ATATTGTTCT GTCC 

(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45 
TGATTCAATT CTGGTGTTAT TTGTTTCCAC

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 
AAGGAATCAT GCAGGCAGGA AAACG 

(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 
ACTTCCAGCG AGTTCCAAGC TC 

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 
AACAGAGCCG TCCATGCCGA TATGG 

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 
TCCATTGCTC CAAAGGGTGT GT 

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 
AGCTTGAGAT GGACTTTGAT TTCTG

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51
GGTCTGATTT CCATCCCGTA CC 

(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52 
GTCCTTTAGA GACCTGGGAA GAG 

(2) INFORMATION FOR SEQ ID NO:53: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53 
GTTTTCTCAA GAGTAGTCCA GCTGC 

(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:
ATCAATTGGC AGTGACTATC ATGGC 

(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55 
TGTTAAGAGC AGTGGAGAAA CGGAC 

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56
GATTGAGACC TTTGATCGTC AACGC 

(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57 
TGACAGGACC ATTAGTGGCT GGAGG 

(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:58 
CGTGCTCACT GGACGATCGG CCGATTTGGA ACTG 

(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:59: 
GGGCTGCTTC CTGATATTTC TGCC 

(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:
CCTGTGGGAA GTGAAGAAAC AACGG 

(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61
GCTCCATCTT CCAGTTCAGC CTTTCCCATG 

(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62 
CTCCGGCTCC AATCTGAGAG TATCC 

(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63
CCTAATATCA TATGGAGGAG GCTGG 

(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 
GAAGGAGAAG AAGTCCAGGT ATTGG 



(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65 
CTGTCGACAA TTGGAGATCC TGACG 

(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66
GTGGAGCATA TGTGAGTGCT ATAGC 

(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67 
TCTGACTATG GCCGGAAGGT ATCTC

(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68 
ACATTAATCT TGGCCCCCAC TAGAG

(2) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69 
CGATCTCCCG CCCGGTGTG 

(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 
CTAACTGGTG ATAGCAGCCT CATGG 

(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:
CCTACTGAGT TGTATCACTT TCTTTCC 

(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 
TGGATTTCTT CCTATTCTCC CTCTTC 

(2) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 
TTCAAGGCTG AGAGGGTTAT AGACC 

(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 
TCTGGTTGGC CTACAGAGTG GCAGC 

(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75 
CCTTCTTTTG TCCAGATTTC CACTTCC 

(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76 
GCGTACAACC ATGCTCTCAG TGAACTGCCG GAGAC 

(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77 
TTCCCAGGGT CATCTTCCCT ATAC 

(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78
GATGCTAGCC GTGATTATGC AGCACATTCC C 

(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 
AAACAGAGAA CACCCCAAGA CAACC 

(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 
CGGCATACAG CGTCCATGCT G 

(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:81 
GTCTCGGGAA AGGATGGCCA TTGTC 

(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82 
CTCTGGTTGC TTTTGCTTGA AGTCC 

(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83 
CCGCCGCTGC TCTTTTCTGA GCTTCTC 

(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84 
AGGACTACAT GGGCTCTGTG TGAGG 



(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85 
GAGAAGTCCA GCTCCGGCC 

(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86 
AGAGAAACAT GGTCACACCA GAAGG 

(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 
GTTCTTCGTG TCCTGGTCCT CC 

(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88 
GGAAATATGG AGGAGCCTAG TGAGG 

(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89 
ACCCAGTACA TCTCATGTGT GG

(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90 
GAGCATGAAA CATCATGGCA CTATGACC 

(2) INFORMATION FOR SEQ ID NO:91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91
TCATGGCACT ATGACCAAGA CCACC 

(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92 
CAGTCTGACC ACTCCGTTCA CC 

(2) INFORMATION FOR SEQ ID NO:93: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93 
AAGGTGAGAA GCAATGCAGC CTTGG 

(2) INFORMATION FOR SEQ ID NO:94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94 
GGGCCATATT CACTGATGAG AACAAGTGG 

(2) INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95 
TCTTTCCCTG TCAACCAGCT CC 

(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: 
AATGAAGATC ACTGGTTCTC CAGAG 

(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 
ACGTGAGCAA GAAAGAGGGA GGAGC 

(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98
TGTCCCATCC TGCTGTGTCA TC 

(2) INFORMATION FOR SEQ ID NO: 99:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 
GCTAGTTTCT TGTGTTCTCC TTCCATGTGG 

(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100
TCATATCGAG AAGAGACCAA AGAGG 

(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101
ACTCCTTCTC CCTCCATCTG TCTG

(2) INFORMATION FOR SEQ ID NO: 102:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:
ATGCTTTTGA AGATTCCTTC TCCCTCC 

(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 
GCACAGCGAT TTCTTCTGTG ATTGTTAGGT GC 

(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 
ACAATGGGAA CCTTCAAGAG GATGG 

(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:105: 
TTATCACATT GGATCCTTCA AGAGGATGGA ATGATTGGAC ACAAG

(2) INFORMATION FOR SEQ ID NO: 106:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 
CAGAAGGGCA CTTGTGTCCA ATCATTCC 

(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 
CTCCCTGGGA AATTCGGGCT C 

(2) INFORMATION FOR SEQ ID NO: 108: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:
CCGTCTCCCG CAAAGACCAC CCTGCTCC

(2) INFORMATION FOR SEQ ID NO: 109:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:
TTATCACCTA TCTAGACCGT CTCCCGCAAA GACCACCCTG CTCC

(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:110: 
GTTGGAACCC AATGTGATGG TACTGC 

(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:
ACAAGTCGAA CAACCTGGTC CATAC 

(2) INFORMATION FOR SEQ ID NO: 112:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:
GCATGTCTTC CGTCGTCATC C 

(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:
CTTGAATCCA CACCCTGTTC CAGAC 

(2) INFORMATION FOR SEQ ID NO: 114: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:
ATACACAGAT TACATGCCAT CCATG 

(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:
TTTTGCCTTC TACCACAGGA C 

(2) INFORMATION FOR SEQ ID NO: 116:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:
GAAACAAGGC TAGAAGTCAG GTCGG 

(2) INFORMATION FOR SEQ ID NO: 117: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117:
GACGGGGCTC ACAGGTAGCA TAG 

(2) INFORMATION FOR SEQ ID NO: 118:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:
GCCTGTAGCT CCACCTGAGA AGGTG 

(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119: 
GGAAGCTGTA CGCATGGCGT AGTGG 

(2) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120: 
GGGCCCCCGT TGTTGCTGC 

(2) INFORMATION FOR SEQ ID NO: 121: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 
AGAACCTGTT GATTCAACAG CACCATTCCA TTTTCTG 

(2) INFORMATION FOR SEQ ID NO: 122:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 59 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122: 
TTATCACCTA GCATGCTCTA GAAGAACCTG TTGATTCAAC AGCACCATTC CATTTTCTG 59

(2) INFORMATION FOR SEQ ID NO: 123:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 52 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 
TTATCACCTA TCTAGAGAAC CTGTTGATTC AACAGCACCA TTCCATTTTC TG 52

(2) INFORMATION FOR SEQ ID NO: 124:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2394 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(ix) FEATURE:

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 1...2394

(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124: 

SfS S S g|* s 2S E S! m S5 51? ££ 2S SS 48 

• 10 15 
ATG GCT TTC ATA GCA TTC TTA AGA TTT CTA GCC ATA CCC CCA ACA GCA 96
Met Ala Phe Ile Ala Phe Leu Arg Phe Leu Ala Ile Pro Pro Thr Ala
20 25 30

GGA ATT TTG GCT AGA TGG GGC TCA TTC AAG AAG AAT GGA GCG ATT AAA 144
Gly Ile Leu Ala Arg Trp Gly Ser Phe Lys Lys Asn Gly Ala Ile Lys
35 40 45

GTG TTA CGG GGT TTC AAG AGA GAA ATC TCA AAC ATG CTA AAC ATA ATG 192
Val Leu Arg Gly Phe Lys Arg Glu Ile Ser Asn Met Leu Asn Ile Met
50 55 60

AAC AGG AGG AAA AGA TCC GTG ACC ATG CTC CTT ATG CTG CTG CCC ACA 240
Asn Arg Arg Lys Arg Ser Val Thr Met Leu Leu Met Leu Leu Pro Thr
65 70 75 80

GCC CTG GCG TTC CAT CTG ACG ACA CGA GGG GGA GAG CCG CAT ATG ATA 288
Ala Leu Ala Phe His Leu Thr Thr Arg Gly Gly Glu Pro His Met Ile
85 90 95

GTT AGC AAG CAG GAA AGA GGA AAG TCA CTT TTG TTC AAG ACC TCT GCA 336
Val Ser Lys Gln Glu Arg Gly Lys Ser Leu Leu Phe Lys Thr Ser Ala
100 105 110

GGT GTC AAC ATG TGC ACC CTC ATT GCG ATG GAT TTG GGA GAG TTG TGT 384
Gly Val Asn Met Cys Thr Leu Ile Ala Met Asp Leu Gly Glu Leu Cys
115 120 125

GAG GAC ACG ATG ACC TAC AAA TGC CCC CGG ATC ACT GAG GCG GAA CCA 432
Glu Asp Thr Met Thr Tyr Lys Cys Pro Arg Ile Thr Glu Ala Glu Pro
130 135 140

GAT GAC GTT GAC TGT TGG TGC AAT GCC ACG GAC ACA TGG GTG ACC TAT 480
Asp Asp Val Asp Cys Trp Cys Asn Ala Thr Asp Thr Trp Val Thr Tyr
145 150 155 160

GGA ACG TGC TCT CAA ACT GGC GAA CAC CGA CGA GAC AAA CGT TCC GTC 528
Gly Thr Cys Ser Gln Thr Gly Glu His Arg Arg Asp Lys Arg Ser Val
165 170 175

GCA TTG GCC CCA CAC GTG GGG CTT GGC CTA GAA ACA AGA GCC GAA ACG 576
Ala Leu Ala Pro His Val Gly Leu Gly Leu Glu Thr Arg Ala Glu Thr
180 185 190

TGG ATG TCC TCT GAA GGT GCT TGG AAA CAG ATA CAA AAA GTA GAG ACT 624
Trp Met Ser Ser Glu Gly Ala Trp Lys Gln Ile Gln Lys Val Glu Thr
195 200 205

TGG GCT CTG AGA CAT CCA GGA TTC ACG GTG ATA GCC CTT TTT CTA GCA 672
Trp Ala Leu Arg His Pro Gly Phe Thr Val Ile Ala Leu Phe Leu Ala
210 215 220

CAT GCC ATA GGA ACA TCC ATC ACC CAG AAA GGG ATC ATT TTC ATT TTG 720
His Ala Ile Gly Thr Ser Ile Thr Gln Lys Gly Ile Ile Phe Ile Leu
225 230 235 240

CTG ATG CTG GTA ACA CCA TCT ATG GCC ATG CGA TGC GTG GGA ATA GGC 768
Leu Met Leu Val Thr Pro Ser Met Ala Met Arg Cys Val Gly Ile Gly
245 250 255

AAC AGA GAC TTC GTG GAA GGA CTG TCA GGA GCA ACA TGG GTG GAT GTG 816
Asn Arg Asp Phe Val Glu Gly Leu Ser Gly Ala Thr Trp Val Asp Val
260 265 270

GTA CTG GAG CAT GGA AGT TGC GTC ACC ACC ATG GCA AAA AAC AAA CCA 864
Val Leu Glu His Gly Ser Cys Val Thr Thr Met Ala Lys Asn Lys Pro
275 280 285

ACA CTG GAC ATT GAA CTC TTG AAG ACG GAG GTC ACA AAC CCT GCA GTT 912
Thr Leu Asp Ile Glu Leu Leu Lys Thr Glu Val Thr Asn Pro Ala Val
290 295 300

CTG CGT AAA TTG TGC ATT GAA GCT AAA ATA TCA AAC ACC ACC ACC GAT 960
Leu Arg Lys Leu Cys Ile Glu Ala Lys Ile Ser Asn Thr Thr Thr Asp
305 310 315 320

TCG AGA TGT CCA ACA CAA GGA GAA GCC ACA CTG GTG GAA GAA CAA GAC 1008
Ser Arg Cys Pro Thr Gln Gly Glu Ala Thr Leu Val Glu Glu Gln Asp
325 330 335

GCG AAC TTT GTG TGC CGA CGA ACG TTC GTG GAC AGA GGC TGG GGC AAT 1056
Ala Asn Phe Val Cys Arg Arg Thr Phe Val Asp Arg Gly Trp Gly Asn
340 345 350

GGC TGT GGG CTA TTC GGA AAA GGT AGT CTA ATA ACG TGT GCC AAG TTT 1104
Gly Cys Gly Leu Phe Gly Lys Gly Ser Leu Ile Thr Cys Ala Lys Phe
355 360 365

AAG TGT GTG ACA AAA CTA GAA GGA AAG ATA GCT CAA TAT GAA AAC CTA 1152
Lys Cys Val Thr Lys Leu Glu Gly Lys Ile Ala Gln Tyr Glu Asn Leu
370 375 380

AAA TAT TCA GTG ATA GTC ACC GTC CAC ACT GGA GAT CAG CAC CAG GTG 1200
Lys Tyr Ser Val Ile Val Thr Val His Thr Gly Asp Gln His Gln Val
385 390 395 400

GGA AAT GAG ACT ACA GAA CAT GGA ACA ACT GCA ACC ATA ACA CCT CAA 1248
Gly Asn Glu Thr Thr Glu His Gly Thr Thr Ala Thr Ile Thr Pro Gln
405 410 415

GCT CCT ACG TCG GAA ATA CAG CTG ACC GAC TAC GGA ACC CTT ACA TTA 1296
Ala Pro Thr Ser Glu Ile Gln Leu Thr Asp Tyr Gly Thr Leu Thr Leu
420 425 430

GAT TGT TCA CCT AGG ACA GGG CTA GAT TTT AAC GAG ATG GTG TTG CTG 1344
Asp Cys Ser Pro Arg Thr Gly Leu Asp Phe Asn Glu Met Val Leu Leu
435 440 445

ACA ATG AAA AAG AAA TCA TGG CTT GTC CAC AAA CAG TGG TTT CTA GAC 1392
Thr Met Lys Lys Lys Ser Trp Leu Val His Lys Gln Trp Phe Leu Asp
450 455 460

TTA CCA CTG CCT TGG ACC TCT GGG GCT TTA ACA TCC CAA GAG ACT TGG 1440
Leu Pro Leu Pro Trp Thr Ser Gly Ala Leu Thr Ser Gln Glu Thr Trp
465 470 475 480

AAC AGA CAA GAT TTA CTG GTC ACA TTT AAG ACA GCT CAT GCA AAG AAG 1488
Asn Arg Gln Asp Leu Leu Val Thr Phe Lys Thr Ala His Ala Lys Lys
485 490 495

CAG GAA GTA GTC GTA CTA GGA TCA CAA GAA rra ppa .

Gin ■«» val Val val Leu g-gS US g} j» gj ACT jog ,536 

505 510 

s. g if ? as a a sg « s as s s s s £ - 

520 525 

s i as ss s?s $ § a. s?s a sg & s a % - 
if a-s ss a a a s s a a a a a s: a - 

555 560 

- sss s sg si ss.s ss ss a vir . jg a 
if? a as a- s # s a S s s us g S s?s 

SS.S2 i£ Gin JS S? ArS Leu £ iS 2S A*S„ £ M & 

as $ as jjj s 3 sg s a s si g « s c a 

|? & SS SS vai U? S2 g US # |f| s m SS S£ ?| '»» 

s 5! l-s if? ss ii? s: s? & a S us s is s S 

6o0 555 

§?? s a g s s s s if? sjg a is s s s; 

6 °5 670 

s s §f? u? ss s sg || Met © & a ® «s as ss *>« 

bao 685 

S f| £ S If? Si s 5 se? u? s s 3 s a 

&s if? ss ifi s a s a 5 ss u? a j- s « 

715 720 

AsS iS ?S C r CTT c CG S*? ATG TGC ATC GCA GTT GGC ATG GTC ACA CTG 
Asn Thr Ser Leu Ser Val Met Cys Ile Ala Val Gly Met Val Thr Leu

1824

TAC CTA GGA GTC ATG GTT CAG GCA GAT TCG GGA TGT GTA ATC AAC TGG 2256
Tyr Leu Gly Val Met Val Gln Ala Asp Ser Gly Cys Val Ile Asn Trp
740 745 750

AAA GGC AGA GAA CTT AAA TGT GGA AGC GGC ATT TTT GTC ACT AAT GAA 2304
Lys Gly Arg Glu Leu Lys Cys Gly Ser Gly Ile Phe Val Thr Asn Glu
755 760 765

GTT CAC ACT TGG ACA GAG CAA TAC AAA TTC CAG GCT GAC TCC CCC AAG 2352
Val His Thr Trp Thr Glu Gln Tyr Lys Phe Gln Ala Asp Ser Pro Lys
770 775 780

AGA CTA TCA GCA GCC ATT GGG AAG GCA TGG GAG GAG GGT GTG 2394
Arg Leu Ser Ala Ala Ile Gly Lys Ala Trp Glu Glu Gly Val
785 790 795

(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2145 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:

(ix) FEATURE:

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 1...2145

(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125: 

AAG GTC TTA AAA GGC TTC AAG AAG GAG ATC TCA AAC ATG CTG AGC ATT 48
Lys Val Leu Lys Gly Phe Lys Lys Glu Ile Ser Asn Met Leu Ser Ile
1 5 10 15

ATC AAC AAA CGG AAA AAG ACA TCG CTC TGT CTC ATG ATG ATG TTA CCA 96
Ile Asn Lys Arg Lys Lys Thr Ser Leu Cys Leu Met Met Met Leu Pro
20 25 30

GCA ACA CTT GCT TTC CAC TTA ACT TCA CGA GAT GGA GAG CCG CGC ATG 144
Ala Thr Leu Ala Phe His Leu Thr Ser Arg Asp Gly Glu Pro Arg Met
35 40 45

ATT GTG GGG AAG AAT GAA AGA GGA AAA TCC CTA CTT TTC AAG ACA GCC 192
Ile Val Gly Lys Asn Glu Arg Gly Lys Ser Leu Leu Phe Lys Thr Ala
50 55 60

TCT GGA ATC AAC ATG TGC ACA CTC ATA GCT ATG GAT CTG GGA GAG ATG 240
Ser Gly Ile Asn Met Cys Thr Leu Ile Ala Met Asp Leu Gly Glu Met
65 70 75 80

TGT GAT GAC ACG GTC ACT TAC AAA TGr rrr *mm *~

Cys Asp ASP Thr VU Thr Tyr 1% ?y°s |S gj S2 Jg g* «f«« 

• 90 9.5. 
CCT GAA GAC ATT GAC TGC TGG TGC AAC CTT ara -vrr »o» 
Pro Glu Asp II. Asp Cys Trp «£ S-S S Sg g &f '£ 

105 110 

5 Iff & £ Asn a s If? « « 3 ss g g ss s 

S3 a a s s s& js a? s if? s % & g g-g. 



288 



336 



384 



432 



S ffi S S 8j If? S S Arg 3 55 8- Lys 

s s s s g as s if? s j£ js a s a 

165 |7 0 



GCC CAT TAC 
Ala His Tyr 



CTA TTA ATG 
Leu Leu Met 
195 

GGA AAC AGA 
Gly Asn Arg 
210 



T?£ r?° £P J CC TTG ACC CAG A AA GTG GTT ATT 
lie Gly Thr Ser Leu Thr Gin Lys Val Val lie 

185 190 

T C In'w T ? A S C CCA TCC ATG AC A ATG AGA TGT GTA 
Leu Val Thr Pro Ser Met Thr Met Arg Cys Val 
200 205 

GAT TTT GTG GAA GGC CTA TCG GGA GCT ACG TGG 
Asp Phe Val Glu Gly Leu Ser Gly ?la JS 
215 220 



GTA GAG 
Val Glu 
160 

TTT CTT 
Phe Leu 
175 • 

TTT ATA 
Phe He 



GGA GTA 
Gly Val 



GTT GAC 
Val Asp 



GTG GTG CTC GAG CAC 
225 G1U HiS 

. CCC- ACG CTG-GAC ATA 
Pro -Thr Leu Asp He 
245 

ACC CTA AGG AAG CTA 
Thr Leu Arg Lys Leu 
260 

GAC TCA AGA TGT CCC 
Asp Ser Arg Cys Pro 
275 

GAC CAG AAC TAC GTG 
P 290 

AAC GGT TGT GGT TTG 
Asn Gly cys Gly Leu 



GGT GGG TGT GTG ACT ACC ATG GCT AAG AAC AAC 
Gly Gly cys Val Thr Thr Met Ala Lys £Js 

• 235 240 

GAG CTT CAG AAG ACC GAG GCC ACC CAA CTG err 
Glu Leu Gin Lys Thr Glu Ala Thr Gin 22 All 
250 255 

TGC ATT GAG GGA AAA ATT ACC AAC ATA ACA ACC 
Cys lie Glu Gly Lys lie Thr Asn lie Thr Th? 
265 270 

S£ C ^ A 2? G GAA GCG ATT TTA CC ? GAG GAG CAG 
Thr Gin Gly Glu Ala lie Leu Pro Glu Glu Gin 
too 285 

TGT AAG CAT ACA TAC GTG GAC AGA GGC TGG GGA 
Cys Lys His Thr Tyr Val Asp Arg Gly Trp Gly 

300 

TTT GGC AAG GGA AGC TTG GTG ACA TGC GCG AAA 
Phe Gly Lys Gly Ser Leu Val Thr Cys Ala Lys 

315 320 



480 



528 



576 



624 



672 



720 



7 68 



816 



864 



912 



960 

TTT CAA TGT TTA GAA TCA ATA GAG GGA AAA GTG GTG CAA CAT GAG AAC 1008
Phe Gln Cys Leu Glu Ser Ile Glu Gly Lys Val Val Gln His Glu Asn
325 330 335

CTC AAA TAC ACC GTC ATC ATC ACA GTG CAC ACA GGA GAC CAA CAC CAG 1056
Leu Lys Tyr Thr Val Ile Ile Thr Val His Thr Gly Asp Gln His Gln
340 345 350

GTG GGA AAT GAA ACG CAG GGA GTC ACG GCT GAG ATA ACA CCC CAG GCA 1104
Val Gly Asn Glu Thr Gln Gly Val Thr Ala Glu Ile Thr Pro Gln Ala
355 360 365

TCA ACC GCT GAA GCC ATT TTA CCT GAA TAT GGA ACC CTC GGG CTA GAA 1152
Ser Thr Ala Glu Ala Ile Leu Pro Glu Tyr Gly Thr Leu Gly Leu Glu
370 375 380

TGC TCA CCA CGG ACA GGT TTG GAT TTC AAT GAA ATG ATC TCA TTG ACA 1200
Cys Ser Pro Arg Thr Gly Leu Asp Phe Asn Glu Met Ile Ser Leu Thr
385 390 395 400

ATG AAG AAC AAA GCA TGG ATG GTA CAT AGA CAA TGG TTC TTT GAC TTA 1248
Met Lys Asn Lys Ala Trp Met Val His Arg Gln Trp Phe Phe Asp Leu
405 410 415

CCC CTA CCA TGG ACA TCA GGA GCT ACA GCA GAA ACA CCA ACT TGG AAC 1296
Pro Leu Pro Trp Thr Ser Gly Ala Thr Ala Glu Thr Pro Thr Trp Asn
420 425 430

AGG AAA GAG CTT CTT GTG ACA TTT AAA AAT GCA CAT GCA AAA AAG CAA 1344
Arg Lys Glu Leu Leu Val Thr Phe Lys Asn Ala His Ala Lys Lys Gln
435 440 445

GAA GTA GTT GTT CTT GGA TCA CAA GAG GGA GCA ATG CAT ACA GCA CTG 1392
Glu Val Val Val Leu Gly Ser Gln Glu Gly Ala Met His Thr Ala Leu
450 455 460

ACA GGA GCT ACA GAG ATC CAA ACC TCA GGA GGC ACA AGT ATC TTT GCG 1440
Thr Gly Ala Thr Glu Ile Gln Thr Ser Gly Gly Thr Ser Ile Phe Ala
465 470 475 480

GGG CAC TTA AAA TGT AGA CTC AAG ATG GAC AAA TTG GAA CTC AAA GGG 1488
Gly His Leu Lys Cys Arg Leu Lys Met Asp Lys Leu Glu Leu Lys Gly
485 490 495

ATG AGC TAT GCA ATG TGC TTG GGT AGC TTT GTG TTG AAG AAA GAA GTC 1536
Met Ser Tyr Ala Met Cys Leu Gly Ser Phe Val Leu Lys Lys Glu Val
500 505 510

TCC GAA ACG CAG CAT GGG ACA ATA CTC ATT AAG GTT GAG TAC AAA GGG 1584
Ser Glu Thr Gln His Gly Thr Ile Leu Ile Lys Val Glu Tyr Lys Gly
515 520 525

AAA GAT GCA CCC TGC AAG ATT CCT TTC TCC ACG GAG GAT GGA CAA GGA 1632
Lys Asp Ala Pro Cys Lys Ile Pro Phe Ser Thr Glu Asp Gly Gln Gly
530 535 540

AAA GCT CAC AAT GGC AGA CTG ATC ACA GCC AAT CCA GTG GTG ACC AAG 1680
Lys Ala His Asn Gly Arg Leu Ile Thr Ala Asn Pro Val Val Thr Lys
545 550 555 560

AAG GAG GAG CCT GTC AAC ATT GAG GCT GAA GCT CCT TTT GGA GAA AGT 1728
Lys Glu Glu Pro Val Asn Ile Glu Ala Glu Pro Pro Phe Gly Glu Ser
565 570 575

AAC ATA GTA ATT GGA ATT GGA GAC AAA GCC CTG AAA ATC AAC TGG TAC 1776
Asn Ile Val Ile Gly Ile Gly Asp Lys Ala Leu Lys Ile Asn Trp Tyr
580 585 590

AAG AAG GGA AGC TCG ATT GGG AAG ATG TTC GAG GCT ACT GCC AGA GGT 1824
Lys Lys Gly Ser Ser Ile Gly Lys Met Phe Glu Ala Thr Ala Arg Gly
595 600 605

GCA AGG CGC ATG GCC ATC TTG GGA GAC ACA GCC TGG GAC TTT GGA TCA 1872
Ala Arg Arg Met Ala Ile Leu Gly Asp Thr Ala Trp Asp Phe Gly Ser
610 615 620

5? rrl if? jF TG " aat £CA TTA GGG AAA ATG GTC CAG CAA ATA TTT 1920 

Val Gly Gly Val Leu Asn Ser Leu Gly Lys Met Val His Gln Ile Phe
625 630 635 640

GGG AGT GCT TAC ACA GCC CTA TTT GGT GGA GTC TCC TGG ATG ATG AAA 1968
Gly Ser Ala Tyr Thr Ala Leu Phe Gly Gly Val Ser Trp Met Met Lys
645 650 655

ill Gli T?o r?3 r CTC T A ££ C TGG ATA GGG TTG AAC TCA AAA AAT 2016 

Ile Gly Ile Gly Val Leu Leu Thr Trp Ile Gly Leu Asn Ser Lys Asn

660 665 670 

ihr Ser £J? SJ ll T c CA ^ GC A ? C GCG A ? A GGA ATC ATT ACA CTC TAT 2064 
Thr Ser Met Ser Phe Ser Cys Ile Ala Ile Gly Ile Ile Thr Leu Tyr

675 680 685

CTG GGA GCC GTG GTG CAA GCT GAC ATG GGG TGT GTC ATA AAC TGG AAA 2112
Leu Gly Ala Val Val Gln Ala Asp Met Gly Cys Val Ile Asn Trp Lys
690 695 700

GGC AAA GAA CTC AAA TGT GGA AGT GGA ATT TTC 2145
Gly Lys Glu Leu Lys Cys Gly Ser Gly Ile Phe
705 710 715

(2) INFORMATION FOR SEQ ID NO: 126: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2175 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE:

(ix) FEATURE:

(A) NAME/KEY: Coding Sequence

(B) LOCATION: 1...2175

(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:

ATT CTG AAG AGA TGG GGA CAG TTG AAG AAA AAT AAG GCC ATC AGG ATA 48
Ile Leu Lys Arg Trp Gly Gln Leu Lys Lys Asn Lys Ala Ile Arg Ile
1 5 10 15

CTG ATT GGA TTC AGG AAG GAG ATA GGC CGC ATG CTG AAC ATC TTG AAC 96
Leu Ile Gly Phe Arg Lys Glu Ile Gly Arg Met Leu Asn Ile Leu Asn
20 25 30

GGG AGA AAA AGG TCA ACG ATA ACA TTG CTG TGC TTG ATT CCC ACC GTA 144
Gly Arg Lys Arg Ser Thr Ile Thr Leu Leu Cys Leu Ile Pro Thr Val
35 40 45

ATG GCG TTT CAC TTG TCA ACA AGA GAT GGC GAA CCC CTC ATG ATA GTG 192
Met Ala Phe His Leu Ser Thr Arg Asp Gly Glu Pro Leu Met Ile Val
50 55 60

GCA AAA CAT GAA AGG GGG AGA CCT CTC TTG TTT AAG ACA ACA GAG GGG 240
Ala Lys His Glu Arg Gly Arg Pro Leu Leu Phe Lys Thr Thr Glu Gly
65 70 75 80

ATC AAC AAA TGC ACT CTC ATT GCC ATG GAC TTG GGT GAA ATG TGT GAG 288
Ile Asn Lys Cys Thr Leu Ile Ala Met Asp Leu Gly Glu Met Cys Glu
85 90 95

GAC ACT GTC ACG TAT AAA TGC CCC TTA CTG GTC AAT ACC GAA CCT GAA 336
Asp Thr Val Thr Tyr Lys Cys Pro Leu Leu Val Asn Thr Glu Pro Glu
100 105 110

GAC ATT GAT TGC TGG TGC AAT CTC ACG TCT ACC TGG GTC ACA TAT GGG 384
Asp Ile Asp Cys Trp Cys Asn Leu Thr Ser Thr Trp Val Thr Tyr Gly
115 120 125

ACA TAC ACC CAG AGC GGA GAA CGG AGA CGA GAG AAG CGC TCA GTA GCT 432
Thr Tyr Thr Gln Ser Gly Glu Arg Arg Arg Glu Lys Arg Ser Val Ala
130 135 140

TTA ACA CCA CAT TCA GGA ATG GGA TTG GAA ACA AGA GCT GAG ACA TGG 480
Leu Thr Pro His Ser Gly Met Gly Leu Glu Thr Arg Ala Glu Thr Trp
145 150 155 160

ATG TCA TCG GAA GGG GCT TGG AAG CAT GCT CAG AGA GTA GAG AGC TGG 528
Met Ser Ser Glu Gly Ala Trp Lys His Ala Gln Arg Val Glu Ser Trp
165 170 175

ATA CTC AGA AAC CCA GGA TTC GCG CTC TTG GCA GGA TTT ATG GCT TAT 576
Ile Leu Arg Asn Pro Gly Phe Ala Leu Leu Ala Gly Phe Met Ala Tyr
180 185 190

ATG ATT GGG CAA ACA GGA ATC CAG CGA ACT GTC TTC TTT GTC CTA ATG 624
Met Ile Gly Gln Thr Gly Ile Gln Arg Thr Val Phe Phe Val Leu Met
195 200 205

ATG CTG GTC GCC CCA TCC TAC GGA ATG CGA TGC GTA GGA GTA GGA AAC 672
Met Leu Val Ala Pro Ser Tyr Gly Met Arg Cys Val Gly Val Gly Asn
210 215 220

AGA GAC TTT GTG GAA GGA GTC TCA GGT GGA GCA TGG GTC GAT CTG rrr 720
Arg Asp Phe Val Glu Gly Val Ser Gly Gly Ala Trp Val Asp Leu Val
225 230 235 240

CTA GAA CAT GGA GGA TGC GTC ACA ACC ATG GCC CAG GGA AAA CCA ACT 768
Leu Glu His Gly Gly Cys Val Thr Thr Met Ala Gln Gly Lys Pro Thr
245 250 255

TTG GAT TTT GAA CTG ACT AAG ACA ACA GCC AAG GAA GTG GCT CTG TTA 816
Leu Asp Phe Glu Leu Thr Lys Thr Thr Ala Lys Glu Val Ala Leu Leu
260 265 270

i?S Thr Tvr 5S x?I r? A * CC £ CA t? A TCA AAC ATA ACC ACG gca aga 864 
Arg Thr Tyr Cys Ile Glu Ala Ser Ile Ser Asn Ile Thr Thr Ala Thr

275 280 285 

AGA TGT CCA ACG CAA GGA GAG CCT TAT CTA AAA GAG GAA CAA GAC CAA 912
Arg Cys Pro Thr Gln Gly Glu Pro Tyr Leu Lys Glu Glu Gln Asp Gln
290 295 300

l AC xT T l GC CGG AGA GAT GTG GTA GAC AGA GGG TGG GGC AAT GGC 960 
Gln Tyr Ile Cys Arg Arg Asp Val Val Asp Arg Gly Trp Gly Asn Gly

305 310 315 320 

TGT GGC TTG TTT GGA AAA GGA GGA GTT GTG ACA TGT GCG AAG TTT TCA 1008
Cys Gly Leu Phe Gly Lys Gly Gly Val Val Thr Cys Ala Lys Phe Ser
325 330 335

TGT TCG GCG AAG ATA ACA GGC AAT TTG GTC CAA ATT GAG AAC CTT GAA 1056
Cys Ser Gly Lys Ile Thr Gly Asn Leu Val Gln Ile Glu Asn Leu Glu
340 345 350

TAC ACA GTG GTT GTA ACA GTC CAC AAT GGA GAC ACC CAT GCA GTA GGA 1104
Tyr Thr Val Val Val Thr Val His Asn Gly Asp Thr His Ala Val Gly
355 360 365

A^n ft C ££ A l CC AAT CAT GGA GTT ACA GCC ACG ATA ACT CCC AGG TCA 1 152 
Asn Asp Thr Ser Asn His Gly Val Thr Ala Thr Ile Thr Pro Arg Ser
370 375 380

52 S TG ^ AA AAA TTG CCG GAC TAT GGA GAA CTA ACA CTC GAT 1200 

385 3^0 ASP ^ 3II GlU LSU Thr LeU As g 

TGT GAA CCC AGG TCT GGA ATT GAC TTT AAT GAG ATG ATT CTG ATG AAA 1248 
Cys Glu Pro Arg Ser Gly Ile Asp Phe Asn Glu Met Ile Leu Met Lys
405 410 415 

ATG AAA AAG AAA ACA TGG CTT GTG CAT AAG CAA TGG TTT TTG GAT CTA 1296 
Met Lys Lys Lys Thr Trp Leu Val His Lys Gin Trp Phe Leu Asp Leu 
420 425 430 

Si S A £ CA l GG ACA GCA GGA GCA GAC ACA TCA CAG GTT CAC TGG AAT 1344 
Pro Leu Pro Trp Thr Ala Gly Ala Asp Thr Ser Glu Val His Trp Asn 
435 440 445 

J AG AAA GAG AGA ATG GTG ACA TTT AAG GTT CCT CAT GCC AAG AGA CAG 1392 
Tyr Lys Glu Arg Met Val Thr Phe Lys Val Pro His Ala Lys Arg Gln

450 455 460



1 104 

GAT GTG ACA GTG CTG GGA TCT CAG GAA GGA GCC ATG CAT TCT GCC CTC 1440
Asp Val Thr Val Leu Gly Ser Gln Glu Gly Ala Met His Ser Ala Leu
465 470 475 480

GCT GGA GCC ACA GAA GTG GAC TCC GGT GAT GGA AAT CAC ATG TTT GCA 1488
Ala Gly Ala Thr Glu Val Asp Ser Gly Asp Gly Asn His Met Phe Ala
485 490 495

GGA CAT CTC AAG TGC AAA GTC CGT ATG GAG AAA TTG AGA ATC AAG GGA 1536
Gly His Leu Lys Cys Lys Val Arg Met Glu Lys Leu Arg Ile Lys Gly
500 505 510

ATG TCA TAC ACG ATG TGT TCA GGA AAG TTC TCA ATT GAC AAA GAG ATG 1584
Met Ser Tyr Thr Met Cys Ser Gly Lys Phe Ser Ile Asp Lys Glu Met
515 520 525

GCA GAA ACA CAG CAT GGG ACA ACA GTG GTG AAA GTC AAG TAT GAA GGT 1632
Ala Glu Thr Gln His Gly Thr Thr Val Val Lys Val Lys Tyr Glu Gly
530 535 540

GCT GGA GCT CCG TGT AAA GTC CCC ATA GAG ATA AGA GAT GTG AAC AAG 1680
Ala Gly Ala Pro Cys Lys Val Pro Ile Glu Ile Arg Asp Val Asn Lys
545 550 555 560

AAA AAA GTG GTT GGG CGT ATC ATC TCA TCC ACC CCT TTG GCT GAG AAT 1728
Lys Lys Val Val Gly Arg Ile Ile Ser Ser Thr Pro Leu Ala Glu Asn
565 570 575

ACC AAC AGT GCA ACC AAC ATA GAG TTA GAA CCC CCC TTT GGG GAC AGC 1776
Thr Asn Ser Ala Thr Asn Ile Glu Leu Glu Pro Pro Phe Gly Asp Ser
580 585 590

TAC ATA GTG ATA GGT GTT GGA AAC AGT GCA TTA ACA CTC CAT TGG TTC 1824
Tyr Ile Val Ile Gly Val Gly Asn Ser Ala Leu Thr Leu His Trp Phe
595 600 605

AGG AAA GGG AGT TCC ATT GGC AAG ATG TTT GAG TCC ACA TAC AGA GGT 1872
Arg Lys Gly Ser Ser Ile Gly Lys Met Phe Glu Ser Thr Tyr Arg Gly
610 615 620

GCA AAA CGA ATG GCC ATT CTA GGT GAA ACA GCT TGG GAT TTT GGT TCC 1920
Ala Lys Arg Met Ala Ile Leu Gly Glu Thr Ala Trp Asp Phe Gly Ser
625 630 635 640

GTT GGT GGA CTG TTC ACA TCA TTG GGA AAG GCT GTG CAC CAG GTT TTT 1968
Val Gly Gly Leu Phe Thr Ser Leu Gly Lys Ala Val His Gln Val Phe
645 650 655

GGA AGT GTG TAT ACA ACC ATG TTT GGA GGA GTC TCA TGG ATG ATT AGA 2016
Gly Ser Val Tyr Thr Thr Met Phe Gly Gly Val Ser Trp Met Ile Arg
660 665 670

ATC CTA ATT GGG TTC CTA GTG TTG TGG ATT GGC ACG AAC TCA AGG AAC 2064
Ile Leu Ile Gly Phe Leu Val Leu Trp Ile Gly Thr Asn Ser Arg Asn
675 680 685

ACT TCA ATG GCT ATG ACG TGC ATA GCT GTT GGA GGA ATC ACT CTG TTT 2112
Thr Ser Met Ala Met Thr Cys Ile Ala Val Gly Gly Ile Thr Leu Phe
690 695 700

CTG GGC TTC ACA GTT CAA GCA GAG ATG GGT TGT GTG GTG TCA TGG AGT 2160
Leu Gly Phe Thr Val Gln Ala Glu Met Gly Cys Val Val Ser Trp Ser
705 710 715 720

GGG AAA GAA TTG AGG 2175
Gly Lys Glu Leu Arg
725

(2) INFORMATION FOR SEQ ID NO: 127: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127: 
CACTACCGCA AGGTAGAGAG CTCGGCATT~ CCTCTTGGTG 40

(2) INFORMATION FOR SEQ ID NO: 128:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:
GTGATGGCGT TCCATCTCTC GAGCCGTAAC GGAGAACCAC 40 
(2) INFORMATION FOR SEQ ID NO: 129: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129:
GCCCTGGCGT TCCATCTCTC GAGCCGAGGG GGAGAGCCGC

(2) INFORMATION FOR SEQ ID NO: 130:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130:
ACACTTGCTT TCCACCTCTC GAGCCGAGAT GGAGAGCCGC 
(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTISENSE: NO

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131:
GTAATGGCGT TTCACCTCTC GAGCAGAGAT GGCGAACCCC 
(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:
CCTATCCTTA CTTAAGATCT TCGTGGAGTG ACAGAC

(2) INFORMATION FOR SEQ ID NO: 133:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133 
GGATAGGAAT GAATTCTAGA AGCACCTCAC TGTCTG 

(2) INFORMATION FOR SEQ ID NO: 134: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: 
CCGCAGAGAT CGTTTTCCTG CCTGCATGAT TCC 

(2) INFORMATION FOR SEQ ID NO: 135: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:
CCGATCCTAA TTTAAGATCT TTGTGCAGGG AAAGCC 

(2) INFORMATION FOR SEQ ID NO: 136: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136: 
CCTATCCCAA CTTGAGATCT TTATGAAGAT ACAGTA 

(2) INFORMATION FOR SEQ ID NO: 137: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO 

(iv) ANTISENSE: NO 

(v) FRAGMENT TYPE: 

(vi) ORIGINAL SOURCE: 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137: 
CCTAACCGTG CTTGAGATCT TTGTGAAGTT ACCGAC 

WHAT IS CLAIMED IS:

1. A quadravalent vaccine providing immunity
against all four serotypes of dengue virus comprising a
DEN-2 PDK-53 infectious clone-derived virus.

2. A quadravalent vaccine providing immunity
against all four serotypes of dengue virus comprising a
chimeric DEN-2/1 virus.

3. A quadravalent vaccine providing immunity
against all four serotypes of dengue virus comprising a
chimeric DEN-2/3 virus.

4. A quadravalent vaccine providing immunity
against all four serotypes of dengue virus comprising a
chimeric DEN-2/4 virus.

5. A quadravalent vaccine providing immunity
against all four serotypes of dengue virus comprising DEN-2
PDK-53 infectious clone-derived and chimeric DEN-2/1,
DEN-2/3, and DEN-2/4 viruses.

6. A method of immunization in which a desired
immune response is produced against all four serotypes of
dengue virus comprising the step of administering to a
subject a quadravalent vaccine comprising DEN-2 PDK-53
infectious clone-derived and chimeric DEN-2/1, DEN-2/3,
and DEN-2/4 viruses.

7. A composition of matter comprising a full
genome-length infectious cDNA clone for a DEN-2 virus,
strain 16681.

8. A composition of matter comprising a full
genome-length infectious cDNA clone for a DEN-2 virus of a
strain characterized as replicating to high titer in cell
culture.

9. A composition of matter comprising a full
genome-length infectious cDNA clone for a DEN-2 virus,
strain 16681, having the identifying characteristics of
ATCC 69826.

10. A composition of matter comprising a full
genome-length infectious cDNA clone for a DEN-2 virus,
strain 16681, attenuated derivative, PDK-53.

11. A composition of matter comprising a full
genome-length infectious cDNA clone for a DEN-2 virus
attenuated derivative, characterized as replicating to
high titer in cell culture.

12. A composition of matter comprising a full
genome-length infectious cDNA clone for a DEN-2 virus,
strain 16681, attenuated derivative, PDK-53, having the
identifying characteristics of ATCC 69825.

13. A composition of matter comprising a full
genome-length infectious cDNA clone of a chimeric DEN-2/1
virus, wherein said virus is characterized as expressing
the prM and E genes of a DEN-1 attenuated virus in
the context of the nonstructural genes of the DEN-2 PDK-53
virus.

14. The composition of matter of Claim 13, wherein
said DEN-1 attenuated virus is DEN-1 PDK-13.

15. A composition of matter comprising a full
genome-length infectious cDNA clone of a chimeric DEN-2
virus, wherein said virus is characterized as expressing
the antigenicity of a DEN-1 attenuated virus.

16. A composition of matter comprising a full
genome-length infectious cDNA clone of a chimeric DEN-2/3
virus, wherein said virus is characterized as expressing
the prM and E genes of a DEN-3 attenuated virus in the
context of the nonstructural genes of the DEN-2 PDK-53
virus.

17. The composition of matter of Claim 16, wherein
said DEN-3 attenuated virus is DEN-3 PGMK30/FRhL-3.

18. A composition of matter comprising a full
genome-length infectious cDNA clone of a chimeric DEN-2
virus, wherein said virus is characterized as expressing
the antigenicity of a DEN-3 attenuated virus.

19. A composition of matter comprising a full
genome-length infectious cDNA clone of a chimeric DEN-2/4
virus, wherein said virus is characterized as expressing
the prM and E genes of a DEN-4 attenuated virus in the
context of the nonstructural genes of the DEN-2 PDK-53
virus.

20. The composition of matter of Claim 19, wherein
said DEN-4 attenuated virus is DEN-4 PDK-48.

21. A composition of matter comprising a full
genome-length infectious cDNA clone of a chimeric DEN-2
virus, wherein said virus is characterized as expressing
the antigenicity of a DEN-4 attenuated virus.

22. A genetic construct comprising a DNA sequence
operably encoding the polyprotein of DEN-2 virus, strain
16681.

23. The genetic construct of Claim 22, wherein said
polyprotein is the polyprotein encoded by the nucleotide
sequence of SEQ ID NO: 1.

24. A genetic construct comprising a DNA sequence
operably encoding at least one protein of DEN-2 virus,
strain 16681.

25. The genetic construct of Claim 24, wherein said
protein is a protein encoded by the nucleotide sequence of
SEQ ID NO: 1.

26. A genetic construct comprising a DNA sequence
operably encoding the polyprotein of DEN-2 virus, strain
16681, attenuated derivative, PDK-53.

27. The genetic construct of Claim 26, wherein said
polyprotein is the polyprotein encoded by the nucleotide
sequence of SEQ ID NO: 2.

28. A genetic construct comprising a DNA sequence
operably encoding at least one protein of DEN-2 virus,
strain 16681, attenuated derivative, PDK-53.

29. The genetic construct of Claim 28, wherein said 
protein is a protein encoded by the nucleotide sequence of 
SEQ ID NO: 2. 

30. A genetic construct comprising a DNA sequence
operably encoding at least one structural protein of DEN-1
PDK-13.

31. The genetic construct of Claim 30, wherein said 
structural protein is a structural protein encoded by the 
nucleotide sequence of SEQ ID NO: 124. 

32. A genetic construct comprising a DNA sequence
operably encoding at least one structural protein of DEN-3
PGMK30/FRhL-3.

33. The genetic construct of Claim 32, wherein said
structural protein is a structural protein encoded by the
nucleotide sequence of SEQ ID NO: 125.

34. A genetic construct comprising a DNA sequence
operably encoding at least one structural protein of DEN-4
PDK-48.

35. The genetic construct of Claim 34, wherein said
structural protein is a structural protein encoded by the
nucleotide sequence of SEQ ID NO: 126.

36. A host cell comprising the genetic construct of 
any of Claims 22-35. 